Merge branch 'master' into docs/api-release (commit 08f408d542)

---
title: InfluxDB Enterprise 1.10 documentation
description: >
  Documentation for InfluxDB Enterprise, which adds clustering, high availability, fine-grained authorization, and more to InfluxDB OSS.
aliases:
  - /enterprise/v1.10/
menu:
  enterprise_influxdb_1_10:
    name: InfluxDB Enterprise v1.10
    weight: 1
---

InfluxDB Enterprise provides a time series database designed to handle high write and query loads and offers highly scalable clusters on your infrastructure with a management UI. Use it for DevOps monitoring, IoT sensor data, and real-time analytics. Check out the key features that make InfluxDB Enterprise a great choice for working with time series data.

If you're interested in working with InfluxDB Enterprise, visit
[InfluxPortal](https://portal.influxdata.com/) to sign up, get a license key,
and get started!

## Key features

- High-performance datastore written specifically for time series data. High ingest speed and data compression.
- Provides high availability across your cluster and eliminates a single point of failure.
- Written entirely in Go. Compiles into a single binary with no external dependencies.
- Simple, high-performing write and query HTTP APIs (see the example below).
- Plugin support for other data ingestion protocols such as Graphite, collectd, and OpenTSDB.
- Expressive SQL-like query language tailored to easily query aggregated data.
- Continuous queries automatically compute aggregate data to make frequent queries more efficient.
- Tags let you index series for fast and efficient queries.
- Retention policies efficiently auto-expire stale data.
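
For example, here is a minimal sketch of the write and query HTTP APIs mentioned above. The host, port, database name, and measurement are illustrative assumptions, not values from this documentation.

```bash
# Assumes a local data node listening on port 8086 and an existing database named "mydb".
# Write one point using line protocol.
curl -XPOST "http://localhost:8086/write?db=mydb" \
  --data-binary 'cpu,host=server01 usage_idle=97.5'

# Query the data back with InfluxQL.
curl -G "http://localhost:8086/query?db=mydb" \
  --data-urlencode 'q=SELECT mean("usage_idle") FROM "cpu" WHERE time > now() - 1h'
```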

## Next steps

- [Install and deploy](/enterprise_influxdb/v1.10/introduction/installation/)
- Review key [concepts](/enterprise_influxdb/v1.10/concepts/)
- [Get started](/enterprise_influxdb/v1.10/introduction/getting-started/)

<!-- Monitor your cluster
- Manage queries
- Manage users
- Explore and visualize your data
-->

---
title: About the project
description: >
  Release notes, licenses, and third-party software details for InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10_ref:
    weight: 10
---

{{< children hlevel="h2" >}}

## Commercial license

InfluxDB Enterprise is available with a commercial license. [Contact sales for more information](https://www.influxdata.com/contact-sales/).

---
title: Third party software
description: >
  InfluxData products contain third-party software that is copyrighted,
  patented, or otherwise legally protected software of third parties
  incorporated in InfluxData products.
menu:
  enterprise_influxdb_1_10_ref:
    name: Third party software
    weight: 20
    parent: About the project
---
|
||||
|
||||
InfluxData products contain third party software, which means the copyrighted,
|
||||
patented, or otherwise legally protected software of third parties that is
|
||||
incorporated in InfluxData products.
|
||||
|
||||
Third party suppliers make no representation nor warranty with respect to
|
||||
such third party software or any portion thereof.
|
||||
Third party suppliers assume no liability for any claim that might arise with
|
||||
respect to such third party software, nor for a
|
||||
customer’s use of or inability to use the third party software.
|
||||
|
||||
InfluxDB Enterprise 1.10 includes the following third-party software components, which are maintained on a version-by-version basis.
|
||||
|
||||
| Component | License | Integration |
|
||||
| :-------- | :-------- | :-------- |
|
||||
| [ASN1 BER Encoding / Decoding Library for the GO programming language (go-asn1-ber/asn1-ber)](https://github.com/go-asn1-ber/asn1-ber) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Cobra is a commander for modern Go CLI interactions (spf13/cobra)](https://github.com/spf13/cobra) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [A golang registry for global request variables (gorilla/context)](https://github.com/gorilla/context) | [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
|
||||
| [FlatBuffers: Memory Efficient Serialization Library (google/flatbuffers)](https://github.com/google/flatbuffers) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
|
||||
| [Flux is a lightweight scripting language for querying databases (like InfluxDB) and working with data (influxdata/flux)](https://github.com/influxdata/flux) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
|
||||
| [GoConvey is a yummy Go testing tool for gophers (glycerine/goconvey)](https://github.com/glycerine/goconvey) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [An immutable radix tree implementation in Golang (hashicorp/go-immutable-radix)](https://github.com/hashicorp/go-immutable-radix)| [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [Some helpful packages for writing Go apps (markbates/going)](https://github.com/markbates/going)| [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Golang LRU cache implements a fixed-size thread safe LRU cache (hashicorp/golang-lru)](https://github.com/hashicorp/golang-lru) |[Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [Codec - a high performance and feature-rich Idiomatic encode/decode and rpc library for msgpack and Binc (hashicorp/go-msgpack)](https://github.com/hashicorp/go-msgpack)| [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
|
||||
| [A Golang library for exporting performance and runtime metrics to external metrics systems, i.e. statsite, statsd (armon/go-metrics)](https://github.com/armon/go-metrics) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Generates UUID-format strings using purely high quality random bytes (hashicorp/go-uuid)](https://github.com/hashicorp/go-uuid) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [Collection of useful handlers for Go net/http package (gorilla/handlers)](https://github.com/gorilla/handlers) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [Golang implementation of Javascript Object Signing and Encryption (dvsekhvalnov/jose2go)](https://github.com/dvsekhvalnov/jose2go) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Basic LDAP v3 functionality for the Go programming language (go-ldap/ldap)](https://github.com/go-ldap/ldap) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Basic LDAP v3 functionality for the Go programming language (mark-rushakoff/ldapserver)](https://github.com/mark-rushakoff/ldapserver) | [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
|
||||
| [A powerful URL router and dispatcher for golang (gorilla/mux)](https://github.com/gorilla/mux) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [pkcs7 implements parsing and creating signed and enveloped messages (fullsailor/pkcs7)](https://github.com/fullsailor/pkcs7) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Pretty printing for Go values (kr/pretty)](https://github.com/kr/pretty) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
|[Go language implementation of the Raft consensus protocol (hashicorp/raft)](https://github.com/hashicorp/raft) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [Raft backend implementation using BoltDB (hashicorp/raft-boltdb)](https://github.com/hashicorp/raft-boltdb) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [General purpose extensions to golang's database/sql (jmoiron/sqlx)](https://github.com/jmoiron/sqlx) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Miscellaneous functions for formatting text (kr/text)](https://github.com/kr/text) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [Golang connection multiplexing library (hashicorp/yamux)](https://github.com/hashicorp/yamux/) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |

---
title: Administer InfluxDB Enterprise
description: Configuration, security, and logging in InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10:
    name: Administration
    weight: 70
---

{{< children >}}

---
title: Back up and restore
description: >
  Back up and restore InfluxDB Enterprise clusters to prevent data loss.
aliases:
  - /enterprise/v1.10/guides/backup-and-restore/
menu:
  enterprise_influxdb_1_10:
    name: Back up and restore
    weight: 10
    parent: Administration
---
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Backup and restore utilities](#backup-and-restore-utilities)
|
||||
- [Exporting and importing data](#exporting-and-importing-data)
|
||||
|
||||
## Overview
|
||||
|
||||
When deploying InfluxDB Enterprise in production environments, you should have a strategy and procedures for backing up and restoring your InfluxDB Enterprise clusters to be prepared for unexpected data loss.
|
||||
|
||||
The tools provided by InfluxDB Enterprise can be used to:
|
||||
|
||||
- Provide disaster recovery due to unexpected events
|
||||
- Migrate data to new environments or servers
|
||||
- Restore clusters to a consistent state
|
||||
- Debug issues
|
||||
|
||||
Depending on the volume of data to be protected and your application requirements, InfluxDB Enterprise offers two methods, described below, for managing backups and restoring data:
|
||||
|
||||
- [Backup and restore utilities](#backup-and-restore-utilities) — For most applications
|
||||
- [Exporting and importing data](#exporting-and-importing-data) — For large datasets
|
||||
|
||||
> **Note:** Use the [`backup` and `restore` utilities (InfluxDB OSS 1.5 and later)](/enterprise_influxdb/v1.10/administration/backup-and-restore/) to:
|
||||
>
|
||||
> - Restore InfluxDB Enterprise backup files to InfluxDB OSS instances.
|
||||
> - Back up InfluxDB OSS data that can be restored in InfluxDB Enterprise clusters.
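
For example, the OSS-side commands might look like the following sketch. The database name and backup path are placeholders, and the exact workflow is described on the linked backup and restore page.

```bash
# On an InfluxDB OSS 1.5+ instance: create a portable backup of a single database.
influxd backup -portable -database telegraf /tmp/oss-backup

# On an InfluxDB OSS instance: restore a portable backup.
influxd restore -portable /tmp/oss-backup
```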
|
||||
|
||||
## Backup and restore utilities
|
||||
|
||||
InfluxDB Enterprise supports backing up and restoring data in a cluster,
|
||||
a single database and retention policy, and single shards.
|
||||
Most InfluxDB Enterprise applications can use the backup and restore utilities.
|
||||
|
||||
Use the `backup` and `restore` utilities to back up and restore between `influxd`
|
||||
instances with the same versions or with only minor version differences.
|
||||
For example, you can back up from {{< latest-patch version="1.10" >}} and restore on {{< latest-patch >}}.
|
||||
|
||||
### Backup utility
|
||||
|
||||
A backup creates a copy of the [metastore](/enterprise_influxdb/v1.10/concepts/glossary/#metastore) and [shard](/enterprise_influxdb/v1.10/concepts/glossary/#shard) data at that point in time and stores the copy in the specified directory.
|
||||
|
||||
Or, back up **only the cluster metastore** using the `-strategy only-meta` backup option. For more information, see [perform a metadata only backup](#perform-a-metadata-only-backup).
|
||||
|
||||
All backups include a manifest, a JSON file describing what was collected during the backup.
|
||||
The filenames reflect the UTC timestamp of when the backup was created, for example:
|
||||
|
||||
- Metastore backup: `20060102T150405Z.meta` (includes usernames and passwords)
|
||||
- Shard data backup: `20060102T150405Z.<shard_id>.tar.gz`
|
||||
- Manifest: `20060102T150405Z.manifest`
|
||||
|
||||
Backups can be full, metastore only, or incremental, and they are incremental by default:
|
||||
|
||||
- **Full backup**: Creates a copy of the metastore and shard data.
|
||||
- **Incremental backup**: Creates a copy of the metastore and shard data that have changed since the last incremental backup. If there are no existing incremental backups, the system automatically performs a complete backup.
|
||||
- **Metastore only backup**: Creates a copy of the metastore data only.
|
||||
|
||||
Restoring different types of backups requires different syntax.
|
||||
To prevent issues with [restore](#restore-utility), keep full backups, metastore only backups, and incremental backups in separate directories.
|
||||
|
||||
>**Note:** The backup utility copies all data through the meta node that is used to
> execute the backup. As a result, performance of a backup and restore is typically limited by the network IO of the meta node. Increasing the resources available to this meta node (such as resizing the EC2 instance) can significantly improve backup and restore performance.
|
||||
|
||||
#### Syntax
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] backup [backup-options] <path-to-backup-directory>
|
||||
```
|
||||
|
||||
> **Note:** The `influxd-ctl backup` command exits with `0` for success and `1` for failure. If the backup fails, output can be directed to a log file to troubleshoot.
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Backup options
|
||||
|
||||
- `-db <string>`: name of the single database to back up
- `-from <TCP-address>`: the data node TCP address to prefer when backing up
- `-strategy`: select the backup strategy to apply during backup
  - `incremental`: _**(Default)**_ back up only data added since the previous backup.
  - `full`: perform a full backup. Same as `-full`.
  - `only-meta`: perform a backup of metadata only: users, roles,
    databases, continuous queries, and retention policies. Shards are not exported.
- `-full`: perform a full backup. Deprecated in favor of `-strategy=full`.
- `-rp <string>`: the name of the single retention policy to back up (must specify `-db` with `-rp`)
- `-shard <unit>`: the ID of the single shard to back up (cannot be used with `-db`)
- `-start <timestamp>`: include all points starting with the specified timestamp (RFC3339 format). Not compatible with `-since` or `-strategy full`.
- `-end <timestamp>`: exclude all points after the specified timestamp (RFC3339 format). Not compatible with `-since` or `-strategy full`.
|
||||
|
||||
### Examples
|
||||
|
||||
#### Back up a database and all retention policies
|
||||
|
||||
Store the following incremental backups in different directories.
|
||||
The first backup specifies `-db myfirstdb` and the second backup specifies
|
||||
different options: `-db myfirstdb` and `-rp autogen`.
|
||||
|
||||
```bash
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
|
||||
influxd-ctl backup -db myfirstdb -rp autogen ./myfirstdb-autogen-backup
|
||||
```
|
||||
#### Back up a database with a specific retention policy
|
||||
|
||||
Store the following incremental backups in the same directory.
|
||||
Both backups specify the same `-db` flag and the same database.
|
||||
|
||||
```bash
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
```
|
||||
#### Back up data from a specific time range
|
||||
|
||||
To back up data in a specific time range, use the `-start` and `-end` options:
|
||||
|
||||
```bash
influxd-ctl backup -db myfirstdb ./myfirstdb-jandata -start 2022-01-01T12:00:00Z -end 2022-01-31T11:59:00Z
```
|
||||
#### Perform an incremental backup
|
||||
|
||||
Perform an incremental backup into the current directory with the command below.
|
||||
If there are any existing backups in the current directory, the system performs an incremental backup.
|
||||
If there aren't any existing backups in the current directory, the system performs a backup of all data in InfluxDB.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup .
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup .
|
||||
Backing up meta data... Done. 421 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 903.539567ms, 307712 bytes transferred
|
||||
Backing up node bf5a5f73bad8:8088, db _internal, rp monitor, shard 1... Done. Backed up in 138.694402ms, 53760 bytes transferred
|
||||
Backing up node 9bf0fa0c302a:8088, db _internal, rp monitor, shard 2... Done. Backed up in 101.791148ms, 40448 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db _internal, rp monitor, shard 3... Done. Backed up in 144.477159ms, 39424 bytes transferred
|
||||
Backed up to . in 1.293710883s, transferred 441765 bytes
|
||||
$ ls
|
||||
20160803T222310Z.manifest 20160803T222310Z.s1.tar.gz 20160803T222310Z.s3.tar.gz
|
||||
20160803T222310Z.meta 20160803T222310Z.s2.tar.gz 20160803T222310Z.s4.tar.gz
|
||||
```
|
||||
|
||||
#### Perform a full backup
|
||||
|
||||
Perform a full backup into a specific directory with the command below.
|
||||
The directory must already exist.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup -full <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup -full backup_dir
|
||||
Backing up meta data... Done. 481 bytes transferred
|
||||
Backing up node <hostname>:8088, db _internal, rp monitor, shard 1... Done. Backed up in 33.207375ms, 238080 bytes transferred
|
||||
Backing up node <hostname>:8088, db telegraf, rp autogen, shard 2... Done. Backed up in 15.184391ms, 95232 bytes transferred
|
||||
Backed up to backup_dir in 51.388233ms, transferred 333793 bytes
|
||||
$ ls backup_dir
|
||||
20170130T184058Z.manifest
|
||||
20170130T184058Z.meta
|
||||
20170130T184058Z.s1.tar.gz
|
||||
20170130T184058Z.s2.tar.gz
|
||||
```
|
||||
|
||||
#### Perform an incremental backup on a single database
|
||||
|
||||
Point at a remote meta server and back up only one database into a given directory (the directory must already exist):
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl -bind <metahost>:8091 backup -db <db-name> <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl -bind 2a1b7a338184:8091 backup -db telegraf ./telegrafbackup
|
||||
Backing up meta data... Done. 318 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 997.168449ms, 399872 bytes transferred
|
||||
Backed up to ./telegrafbackup in 1.002358077s, transferred 400190 bytes
|
||||
$ ls ./telegrafbackup
|
||||
20160803T222811Z.manifest 20160803T222811Z.meta 20160803T222811Z.s4.tar.gz
|
||||
```
|
||||
|
||||
#### Perform a metadata only backup
|
||||
|
||||
Perform a metadata only backup into a specific directory with the command below.
|
||||
The directory must already exist.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup -strategy only-meta <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup -strategy only-meta backup_dir
|
||||
Backing up meta data... Done. 481 bytes transferred
|
||||
Backed up to backup_dir in 51.388233ms, transferred 481 bytes
|
||||
$ ls backup_dir
|
||||
20170130T184058Z.manifest
|
||||
20170130T184058Z.meta
|
||||
```
|
||||
|
||||
### Restore utility
|
||||
|
||||
#### Disable anti-entropy (AE) before restoring a backup
|
||||
|
||||
> Before restoring a backup, stop the anti-entropy (AE) service (if enabled) on **each data node in the cluster, one at a time**.
|
||||
|
||||
>
|
||||
> 1. Stop the `influxd` service.
|
||||
> 2. Set `[anti-entropy].enabled` to `false` in the data node configuration file (by default, `influxdb.conf`).
|
||||
> 3. Restart the `influxd` service and wait for the data node to receive read and write requests and for the [hinted handoff queue](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff) to drain.
|
||||
> 4. Once AE is disabled on all data nodes and each node returns to a healthy state, you're ready to restore the backup. For details on how to restore your backup, see examples below.
|
||||
> 5. After restoring the backup, restart AE services on each data node.
|
||||
|
||||
##### Restore a backup
|
||||
|
||||
Restore a backup to an existing cluster or a new cluster.
|
||||
By default, a restore writes to databases using the backed-up data's [replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor).
|
||||
An alternate replication factor can be specified with the `-newrf` flag when restoring a single database.
|
||||
Restore supports both `-full` backups and incremental backups; the syntax for
|
||||
a restore differs depending on the backup type.
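
For instance, a hedged sketch of restoring a single database with a different replication factor might look like this (the database names and backup directory are placeholders):

```bash
# Restore the backed-up "telegraf" database into a new database with a replication factor of 2.
influxd-ctl restore -db telegraf -newdb telegraf_rf2 -newrf 2 my-incremental-backup/
```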
|
||||
|
||||
##### Restores from an existing cluster to a new cluster
|
||||
|
||||
Restores from an existing cluster to a new cluster restore the existing cluster's
|
||||
[users](/enterprise_influxdb/v1.10/concepts/glossary/#user), roles,
|
||||
[databases](/enterprise_influxdb/v1.10/concepts/glossary/#database), and
|
||||
[continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq) to
|
||||
the new cluster.
|
||||
|
||||
They do not restore Kapacitor [subscriptions](/enterprise_influxdb/v1.10/concepts/glossary/#subscription).
|
||||
In addition, restores to a new cluster drop any data in the new cluster's
|
||||
`_internal` database and begin writing to that database anew.
|
||||
The restore does not write the existing cluster's `_internal` database to
|
||||
the new cluster.
|
||||
|
||||
#### Syntax to restore from incremental and metadata backups
|
||||
|
||||
Use the syntax below to restore an incremental or metadata backup to a new cluster or an existing cluster.
|
||||
**The existing cluster must contain no data in the affected databases.**
|
||||
Performing a restore from an incremental backup requires the path to the incremental backup's directory.
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] restore [restore-options] <path-to-backup-directory>
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The existing cluster can have data in the `_internal` database (the database InfluxDB creates if
|
||||
[internal monitoring](/platform/monitoring/influxdata-platform/tools/measurements-internal) is enabled).
|
||||
The system automatically drops the `_internal` database when it performs a complete restore.
|
||||
{{% /note %}}
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Restore options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#restore)
|
||||
for a complete list of `influxd-ctl restore` options.
|
||||
|
||||
- `-db <string>`: the name of the single database to restore
|
||||
- `-list`: shows the contents of the backup
|
||||
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
|
||||
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
|
||||
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
|
||||
- `-rp <string>`: the name of the single retention policy to restore
|
||||
- `-shard <unit>`: the shard ID to restore
|
||||
|
||||
#### Syntax to restore from a full or manifest only backup
|
||||
|
||||
Use the syntax below to restore a full or manifest only backup to a new cluster or an existing cluster.
|
||||
Note that the existing cluster must contain no data in the affected databases.*
|
||||
Performing a restore requires the `-full` flag and the path to the backup's manifest file.
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] restore [options] -full <path-to-manifest-file>
|
||||
```
|
||||
|
||||
\* The existing cluster can have data in the `_internal` database, the database
|
||||
that the system creates by default.
|
||||
The system automatically drops the `_internal` database when it performs a
|
||||
complete restore.
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Restore options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.10/tools/influxd-ctl/#restore)
|
||||
for a complete list of `influxd-ctl restore` options.
|
||||
|
||||
- `-db <string>`: the name of the single database to restore
|
||||
- `-list`: shows the contents of the backup
|
||||
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
|
||||
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
|
||||
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
|
||||
- `-rp <string>`: the name of the single retention policy to restore
|
||||
- `-shard <unit>`: the shard ID to restore
|
||||
|
||||
#### Examples
|
||||
|
||||
##### Restore from an incremental backup
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 2...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 61.046571ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 83.892591ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
##### Restore from a metadata backup
|
||||
|
||||
In this example, the `restore` command restores a [metadata backup](#perform-a-metadata-only-backup)
|
||||
stored in the `metadata-backup/` directory.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore metadata-backup/
|
||||
Using backup directory: metadata-backup/
|
||||
Using meta backup: 20200101T000000Z.meta
|
||||
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
|
||||
Restored from metadata-backup/ in 19.2311ms, transferred 588 bytes
|
||||
```
|
||||
|
||||
##### Restore from a `-full` backup
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore -full <path-to-manifest-file>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore -full my-full-backup/20170131T020341Z.manifest
|
||||
Using manifest: my-full-backup/20170131T020341Z.manifest
|
||||
Restoring meta data... Done. Restored in 9.585639ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 2...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 48.095082ms, 569344 bytes transferred
|
||||
Restored from my-full-backup in 58.58301ms, transferred 569344 bytes
|
||||
```
|
||||
|
||||
##### Restore from an incremental backup for a single database and give the database a new name
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore -db <src> -newdb <dest> <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 8.119655ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 4...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 4 in 57.89687ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 66.715524ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
##### Restore from an incremental backup for a database and merge that database into an existing database
|
||||
|
||||
Your `telegraf` database was mistakenly dropped, but you have a recent backup so you've only lost a small amount of data.
|
||||
|
||||
If Telegraf is still running, it will recreate the `telegraf` database shortly after the database is dropped.
|
||||
You might try to directly restore your `telegraf` backup just to find that you can't restore:
|
||||
|
||||
```bash
|
||||
$ influxd-ctl restore -db telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Error.
|
||||
restore: operation exited with error: problem setting snapshot: database already exists
|
||||
```
|
||||
|
||||
To work around this, you can restore your telegraf backup into a new database by specifying the `-db` flag for the source and the `-newdb` flag for the new destination:
|
||||
|
||||
```bash
|
||||
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 19.915242ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 7...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 7 in 36.417682ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 56.623615ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
Then, in the [`influx` client](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/), use an [`INTO` query](/enterprise_influxdb/v1.10/query_language/explore-data/#the-into-clause) to copy the data from the new database into the existing `telegraf` database:
|
||||
|
||||
```bash
|
||||
$ influx
|
||||
> USE restored_telegraf
|
||||
Using database restored_telegraf
|
||||
> SELECT * INTO telegraf..:MEASUREMENT FROM /.*/ GROUP BY *
|
||||
name: result
|
||||
------------
|
||||
time written
|
||||
1970-01-01T00:00:00Z 471
|
||||
```
|
||||
|
||||
##### Restore (overwrite) metadata from a full or incremental backup to fix damaged metadata
|
||||
|
||||
1. Identify a backup with uncorrupted metadata from which to restore.
|
||||
2. Restore from backup with `-meta-only-overwrite-force`.
|
||||
|
||||
{{% warn %}}
|
||||
Only use the `-meta-only-overwrite-force` flag to restore from backups of the target cluster.
|
||||
If you use this flag with metadata from a different cluster, you will lose data,
since metadata includes shard assignments to data nodes.
|
||||
{{% /warn %}}
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore -meta-only-overwrite-force <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore -meta-only-overwrite-force my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20200101T000000Z.meta
|
||||
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
|
||||
Restored from my-incremental-backup/ in 19.2311ms, transferred 588 bytes
|
||||
```
|
||||
|
||||
#### Common issues with restore
|
||||
|
||||
##### Restore writes information not part of the original backup
|
||||
|
||||
If a [restore from an incremental backup](#syntax-to-restore-from-incremental-and-metadata-backups)
|
||||
does not limit the restore to the same database, retention policy, and shard specified by the backup command,
|
||||
the restore may appear to restore information that was not part of the original backup.
|
||||
Backups consist of a shard data backup and a metastore backup.
|
||||
The **shard data backup** contains the actual time series data: the measurements, tags, fields, and so on.
|
||||
The **metastore backup** contains user information, database names, retention policy names, shard metadata, continuous queries, and subscriptions.
|
||||
|
||||
When the system creates a backup, the backup includes:
|
||||
|
||||
* the relevant shard data determined by the specified backup options
|
||||
* all of the metastore information in the cluster regardless of the specified backup options
|
||||
|
||||
Because a backup always includes the complete metastore information, a restore that doesn't include the same options specified by the backup command may appear to restore data that were not targeted by the original backup.
|
||||
The unintended data, however, include only the metastore information, not the shard data associated with that metastore information.
|
||||
|
||||
##### Restore a backup created prior to version 1.2.0
|
||||
|
||||
InfluxDB Enterprise introduced incremental backups in version 1.2.0.
|
||||
To restore a backup created prior to version 1.2.0, be sure to follow the syntax
|
||||
for [restoring from a full backup](#restore-from-a-full-backup).
|
||||
|
||||
## Exporting and importing data
|
||||
|
||||
For most InfluxDB Enterprise applications, the [backup and restore utilities](#backup-and-restore-utilities) provide the tools you need for your backup and restore strategy. However, in some cases, the standard backup and restore utilities may not adequately handle the volumes of data in your application.
|
||||
|
||||
As an alternative to the standard backup and restore utilities, use the InfluxDB `influx_inspect export` and `influx -import` commands to create backup and restore procedures for your disaster recovery and backup strategy. These commands can be executed manually or included in shell scripts that run the export and import operations at scheduled intervals (example below).
|
||||
|
||||
### Exporting data
|
||||
|
||||
Use the [`influx_inspect export` command](/enterprise_influxdb/v1.10/tools/influx_inspect#export) to export data in line protocol format from your InfluxDB Enterprise cluster. Options include:
|
||||
|
||||
- Exporting all, or specific, databases
|
||||
- Filtering with starting and ending timestamps
|
||||
- Using gzip compression for smaller files and faster exports
|
||||
|
||||
For details on optional settings and usage, see [`influx_inspect export` command](/enterprise_influxdb/v1.10/tools/influx_inspect#export).
|
||||
|
||||
In the following example, the exported data is filtered to include only one day and compressed for optimal speed and file size.
|
||||
|
||||
```bash
|
||||
influx_inspect export \
|
||||
-database myDB \
|
||||
-compress \
|
||||
-start 2019-05-19T00:00:00.000Z \
|
||||
-end 2019-05-19T23:59:59.999Z
|
||||
```
|
||||
|
||||
### Importing data
|
||||
|
||||
After exporting the data in line protocol format, you can import the data using the [`influx -import` CLI command](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/#import-data-from-a-file-with--import).
|
||||
|
||||
In the following example, the compressed data file is imported into the specified database.
|
||||
|
||||
```bash
|
||||
influx -import -database myDB -compressed
|
||||
```
|
||||
|
||||
For details on using the `influx -import` command, see [Import data from a file with -import](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/#import-data-from-a-file-with--import).
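
As a sketch of how the export and import commands might be combined in a scheduled job, the following script exports one day of data and stages it for import on a recovery cluster. The paths, database name, time window, and use of GNU `date` are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Illustrative nightly export job; adjust paths, database name, and time window for your environment.
set -euo pipefail

DB="myDB"
EXPORT_DIR="/backups/influxdb/$(date -u +%Y%m%d)"
START="$(date -u -d 'yesterday 00:00:00' +%Y-%m-%dT%H:%M:%SZ)"   # requires GNU date
END="$(date -u -d 'today 00:00:00' +%Y-%m-%dT%H:%M:%SZ)"

mkdir -p "${EXPORT_DIR}"

# Export one day of data as compressed line protocol.
influx_inspect export \
  -database "${DB}" \
  -compress \
  -start "${START}" \
  -end "${END}" \
  -out "${EXPORT_DIR}/${DB}.lp.gz"

# On the recovery cluster, import the exported file, for example:
# influx -import -path "${EXPORT_DIR}/${DB}.lp.gz" -compressed
```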
|
||||
|
||||
### Example
|
||||
|
||||
For an example of using the exporting and importing data approach for disaster recovery, see the Capital One presentation from InfluxDays 2019 on ["Architecting for Disaster Recovery"](https://www.youtube.com/watch?v=LyQDhSdnm4A). In this presentation, Capital One discusses the following:
|
||||
|
||||
- Exporting data every 15 minutes from an active cluster to an AWS S3 bucket.
|
||||
- Replicating the export file in the S3 bucket using the AWS S3 copy command.
|
||||
- Importing data every 15 minutes from the AWS S3 bucket to a cluster available for disaster recovery.
|
||||
- Advantages of the export-import approach over the standard backup and restore utilities for large volumes of data.
|
||||
- Managing users and scheduled exports and imports with a custom administration tool.
|

---
title: Configure
description: Configure cluster and node settings in InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_10:
    name: Configure
    weight: 11
    parent: Administration
---

{{< children >}}

---
title: Use Anti-Entropy service in InfluxDB Enterprise
description: The Anti-Entropy service monitors and repairs shards in InfluxDB.
aliases:
  - /enterprise_influxdb/v1.10/guides/Anti-Entropy/
  - /enterprise_influxdb/v1.10/administration/anti-entropy/
menu:
  enterprise_influxdb_1_10:
    name: Use Anti-entropy service
    parent: Configure
    weight: 50
---
|
||||
|
||||
{{% warn %}}
|
||||
Prior to InfluxDB Enterprise 1.7.2, the Anti-Entropy (AE) service was enabled by default. When shards create digests with lots of time ranges (tens of thousands), some customers have experienced significant performance issues, including CPU usage spikes. If your shards include a small number of time ranges (most have 1 to 10, some have up to several hundred) and you can benefit from the AE service, enable AE and monitor it closely to see if your performance is adversely impacted.
|
||||
{{% /warn %}}
|
||||
|
||||
## Introduction
|
||||
|
||||
Shard entropy refers to inconsistency among shards in a shard group.
|
||||
This can be due to the "eventually consistent" nature of data stored in InfluxDB
|
||||
Enterprise clusters or due to missing or unreachable shards.
|
||||
The Anti-Entropy (AE) service ensures that each data node has all the shards it
|
||||
owns according to the metastore and that all shards in a shard group are consistent.
|
||||
Missing shards are automatically repaired without operator intervention while
|
||||
out-of-sync shards can be manually queued for repair.
|
||||
This topic covers how the Anti-Entropy service works and some of the basic situations where it takes effect.
|
||||
|
||||
## Concepts
|
||||
|
||||
The Anti-Entropy service is a component of the `influxd` service available on each of your data nodes. Use this service to ensure that each data node has all of the shards that the metastore says it owns and ensure all shards in a shard group are in sync.
|
||||
If any shards are missing, the Anti-Entropy service will copy existing shards from other shard owners.
|
||||
If data inconsistencies are detected among shards in a shard group, [invoke the Anti-Entropy service](#command-line-tools-for-managing-entropy) and queue the out-of-sync shards for repair.
|
||||
In the repair process, the Anti-Entropy service will sync the necessary updates from other shards
|
||||
within a shard group.
|
||||
|
||||
By default, the service performs consistency checks every 5 minutes. This interval can be modified in the [`anti-entropy.check-interval`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#check-interval) configuration setting.
|
||||
|
||||
The Anti-Entropy service can only address missing or inconsistent shards when
|
||||
there is at least one copy of the shard available.
|
||||
In other words, as long as new and healthy nodes are introduced, a replication
|
||||
factor of 2 can recover from one missing or inconsistent node;
|
||||
a replication factor of 3 can recover from two missing or inconsistent nodes, and so on.
|
||||
A replication factor of 1, which is not recommended, cannot be recovered by the Anti-Entropy service.
|
||||
|
||||
## Symptoms of entropy
|
||||
|
||||
The Anti-Entropy service automatically detects and fixes missing shards, but shard inconsistencies
|
||||
must be [manually detected and queued for repair](#detecting-and-repairing-entropy).
|
||||
There are symptoms of entropy that, if seen, would indicate an entropy repair is necessary.
|
||||
|
||||
### Different results for the same query
|
||||
|
||||
When running queries against an InfluxDB Enterprise cluster, each query may be routed to a different data node.
|
||||
If entropy affects data within the queried range, the same query will return different
|
||||
results depending on which node the query runs against.
|
||||
|
||||
_**Query attempt 1**_
|
||||
|
||||
```sql
|
||||
SELECT mean("usage_idle") WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
|
||||
|
||||
name: cpu
|
||||
time mean
|
||||
---- ----
|
||||
1528308000000000000 99.11867392974537
|
||||
1528308180000000000 99.15410822137049
|
||||
1528308360000000000 99.14927494363032
|
||||
1528308540000000000 99.1980535465783
|
||||
1528308720000000000 99.18584290492262
|
||||
```
|
||||
|
||||
_**Query attempt 2**_
|
||||
|
||||
```sql
|
||||
SELECT mean("usage_idle") WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
|
||||
|
||||
name: cpu
|
||||
time mean
|
||||
---- ----
|
||||
1528308000000000000 99.11867392974537
|
||||
1528308180000000000 0
|
||||
1528308360000000000 0
|
||||
1528308540000000000 0
|
||||
1528308720000000000 99.18584290492262
|
||||
```
|
||||
|
||||
The results indicate that data is missing in the queried time range and entropy is present.
|
||||
|
||||
### Flapping dashboards
|
||||
|
||||
A "flapping" dashboard means data visualizations change when data is refreshed
|
||||
and pulled from a node with entropy (inconsistent data).
|
||||
It is the visual manifestation of getting [different results from the same query](#different-results-for-the-same-query).
|
||||
|
||||
<img src="/img/enterprise/1-6-flapping-dashboard.gif" alt="Flapping dashboard" style="width:100%; max-width:800px">
|
||||
|
||||
## Technical details
|
||||
|
||||
### Detecting entropy
|
||||
|
||||
The Anti-Entropy service runs on each data node and periodically checks its shards' statuses
|
||||
relative to the next data node in the ownership list.
|
||||
The service creates a "digest" or summary of data in the shards on the node.
|
||||
|
||||
For example, assume there are two data nodes in your cluster: `node1` and `node2`.
|
||||
Both `node1` and `node2` own `shard1` so `shard1` is replicated across each.
|
||||
|
||||
When a status check runs, `node1` will ask `node2` when `shard1` was last modified.
|
||||
If the reported modification time differs from the previous check,
`node1` asks `node2` for a new digest of `shard1` and checks for differences (performs a "diff") between `node2`'s `shard1` digest and its local `shard1` digest.
|
||||
If a difference exists, `shard1` is flagged as having entropy.
|
||||
|
||||
### Repairing entropy
|
||||
|
||||
If during a status check a node determines the next node is completely missing a shard,
|
||||
it immediately adds the missing shard to the repair queue.
|
||||
A background routine monitors the queue and begins the repair process as new shards are added to it.
|
||||
Repair requests are pulled from the queue by the background process and repaired using a `copy shard` operation.
|
||||
|
||||
> Currently, shards that are present on both nodes but contain different data are not automatically queued for repair.
|
||||
> A user must make the request via `influxd-ctl entropy repair <shard ID>`.
|
||||
> For more information, see [Detecting and repairing entropy](#detecting-and-repairing-entropy) below.
|
||||
|
||||
Using `node1` and `node2` from the [earlier example](#detecting-entropy), `node1` asks `node2` for a digest of `shard1`.
|
||||
`node1` diffs its own local `shard1` digest and `node2`'s `shard1` digest,
|
||||
then creates a new digest containing only the differences (the diff digest).
|
||||
The diff digest is used to create a patch containing only the data `node2` is missing.
|
||||
`node1` sends the patch to `node2` and instructs it to apply it.
|
||||
Once `node2` finishes applying the patch, it queues a repair for `shard1` locally.
|
||||
|
||||
The "node-to-node" shard repair continues until it runs on every data node that owns the shard in need of repair.
|
||||
|
||||
### Repair order
|
||||
|
||||
Repairs between shard owners happen in a deterministic order.
|
||||
This doesn't mean repairs always start on node 1 and then follow a specific node order.
|
||||
Repairs are viewed at the shard level.
|
||||
Each shard has a list of owners and the repairs for a particular shard will happen
|
||||
in a deterministic order among its owners.
|
||||
|
||||
When the Anti-Entropy service on any data node receives a repair request for a shard, it determines which
|
||||
owner node is the first in the deterministic order and forwards the request to that node.
|
||||
The request is now queued on the first owner.
|
||||
|
||||
The first owner's repair processor pulls it from the queue, detects the differences
|
||||
between the local copy of the shard with the copy of the same shard on the next
|
||||
owner in the deterministic order, then generates a patch from that difference.
|
||||
The first owner then makes an RPC call to the next owner instructing it to apply
|
||||
the patch to its copy of the shard.
|
||||
|
||||
Once the next owner has successfully applied the patch, it adds that shard to the Anti-Entropy repair queue.
|
||||
A list of "visited" nodes follows the repair through the list of owners.
|
||||
Each owner will check the list to detect when the repair has cycled through all owners,
|
||||
at which point the repair is finished.
|
||||
|
||||
### Hot shards
|
||||
|
||||
The Anti-Entropy service does its best to avoid hot shards (shards that are currently receiving writes)
|
||||
because they change quickly.
|
||||
While write replication between shard owner nodes (with a
|
||||
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
|
||||
greater than 1) typically happens in milliseconds, this slight difference is
|
||||
still enough to cause the appearance of entropy where there is none.
|
||||
|
||||
Because the Anti-Entropy service repairs only cold shards, unexpected effects can occur.
|
||||
Consider the following scenario:
|
||||
|
||||
1. A shard goes cold.
|
||||
2. Anti-Entropy detects entropy.
|
||||
3. Entropy is reported by the [Anti-Entropy `/status` API](/enterprise_influxdb/v1.10/administration/anti-entropy-api/#get-status) or with the `influxd-ctl entropy show` command.
|
||||
4. Shard takes a write, gets compacted, or something else causes it to go hot.
|
||||
_These actions are out of Anti-Entropy's control._
|
||||
5. A repair is requested, but is ignored because the shard is now hot.
|
||||
|
||||
In this example, you would have to periodically request a repair of the shard
|
||||
until it either shows as being in the queue, being repaired, or no longer in the list of shards with entropy.
|
||||
|
||||
## Configuration
|
||||
|
||||
The configuration settings for the Anti-Entropy service are described in [Anti-Entropy settings](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#anti-entropy-ae-settings) section of the data node configuration.
|
||||
|
||||
To enable the Anti-Entropy service, change the default value of the `[anti-entropy].enabled = false` setting to `true` in the `influxdb.conf` file of each of your data nodes.
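
For example, the relevant section of a data node's `influxdb.conf` might look like the following sketch. Only the `enabled` key is taken from this page; the check interval is shown with its documented 5-minute default.

```toml
[anti-entropy]
  # Enable the Anti-Entropy service on this data node.
  enabled = true
  # How often shards are checked for consistency (documented default).
  check-interval = "5m"
```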
|
||||
|
||||
## Command line tools for managing entropy
|
||||
|
||||
>**Note:** The Anti-Entropy service is disabled by default and must be enabled before using these commands.
|
||||
|
||||
The `influxd-ctl entropy` command enables you to manage entropy among shards in a cluster.
|
||||
It includes the following subcommands:
|
||||
|
||||
#### `show`
|
||||
|
||||
Lists shards that are in an inconsistent state and in need of repair as well as
|
||||
shards currently in the repair queue.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
```
|
||||
|
||||
#### `repair`
|
||||
|
||||
Queues a shard for repair.
|
||||
It requires a shard ID, which is provided in the [`show`](#show) output.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy repair <shardID>
|
||||
```
|
||||
|
||||
Repairing entropy in a shard is an asynchronous operation.
|
||||
This command will return quickly as it only adds a shard to the repair queue.
|
||||
Queuing shards for repair is idempotent.
|
||||
There is no harm in making multiple requests to repair the same shard even if
|
||||
it is already queued, currently being repaired, or not in need of repair.
|
||||
|
||||
#### `kill-repair`
|
||||
|
||||
Removes a shard from the repair queue.
|
||||
It requires a shard ID, which is provided in the [`show`](#show) output.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy kill-repair <shardID>
|
||||
```
|
||||
|
||||
This only applies to shards in the repair queue.
|
||||
It does not cancel repairs on nodes that are in the process of being repaired.
|
||||
Once a repair has started, requests to cancel it are ignored.
|
||||
|
||||
> Stopping an entropy repair for a **missing** shard is not currently supported.
|
||||
> It may be possible to stop repairs for missing shards with the
|
||||
> [`influxd-ctl kill-copy-shard`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#kill-copy-shard) command.
|
||||
|
||||
## InfluxDB Anti-Entropy API
|
||||
|
||||
The Anti-Entropy service uses an API for managing and monitoring entropy.
|
||||
Details on the available API endpoints can be found in [The InfluxDB Anti-Entropy API](/enterprise_influxdb/v1.10/administration/anti-entropy-api).
|
||||
|
||||
## Use cases
|
||||
|
||||
Common use cases for the Anti-Entropy service include detecting and repairing entropy, replacing unresponsive data nodes, replacing data nodes for upgrades and maintenance, and eliminating entropy in active shards.
|
||||
|
||||
### Detecting and repairing entropy
|
||||
|
||||
Periodically, you may want to see if shards in your cluster have entropy or are
|
||||
inconsistent with other shards in the shard group.
|
||||
Use the `influxd-ctl entropy show` command to list all shards with detected entropy:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
|
||||
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
|
||||
```
|
||||
|
||||
Then use the `influxd-ctl entropy repair` command to add the shards with entropy
|
||||
to the repair queue:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy repair 21179
|
||||
|
||||
Repair Shard 21179 queued
|
||||
|
||||
influxd-ctl entropy repair 25165
|
||||
|
||||
Repair Shard 25165 queued
|
||||
```
|
||||
|
||||
Check on the status of the repair queue with the `influxd-ctl entropy show` command:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
|
||||
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
|
||||
|
||||
Queued Shards: [21179 25165]
|
||||
```
|
||||
|
||||
### Replacing an unresponsive data node
|
||||
|
||||
If a data node suddenly disappears due to a catastrophic hardware failure or for any other reason, as soon as a new data node is online, the Anti-Entropy service will copy the correct shards to the new replacement node. The time it takes for the copying to complete is determined by the number of shards to be copied and how much data is stored in each.
|
||||
|
||||
_View the [Replacing Data Nodes](/enterprise_influxdb/v1.10/guides/replacing-nodes/#replace-data-nodes-in-an-influxdb-enterprise-cluster) documentation for instructions on replacing data nodes in your InfluxDB Enterprise cluster._
|
||||
|
||||
### Replacing a machine that is running a data node
|
||||
|
||||
Perhaps you are replacing a machine that is being decommissioned, upgrading hardware, or something else entirely.
|
||||
The Anti-Entropy service will automatically copy shards to the new machines.
|
||||
|
||||
Once you have successfully run the `influxd-ctl update-data` command, you are free
|
||||
to shut down the retired node without causing any interruption to the cluster.
|
||||
The Anti-Entropy process will continue copying the appropriate shards from the
|
||||
remaining replicas in the cluster.
|
||||
|
||||
### Fixing entropy in active shards
|
||||
|
||||
In rare cases, the currently active shard, or the shard to which new data is
|
||||
currently being written, may find itself with inconsistent data.
|
||||
Because the Anti-Entropy process can't write to hot shards, you must stop writes to the new
|
||||
shard using the [`influxd-ctl truncate-shards` command](/enterprise_influxdb/v1.10/tools/influxd-ctl/#truncate-shards),
|
||||
then add the inconsistent shard to the entropy repair queue:
|
||||
|
||||
```bash
|
||||
# Truncate hot shards
|
||||
influxd-ctl truncate-shards
|
||||
|
||||
# Show shards with entropy
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2018-06-06 12:00:00 +0000 UTC 2018-06-06 23:44:12 +0000 UTC 2018-12-06 00:00:00 +0000 UTC diff
|
||||
|
||||
# Add the inconsistent shard to the repair queue
|
||||
influxd-ctl entropy repair 21179
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Queued repairs are not being processed
|
||||
|
||||
The primary reason a repair in the repair queue isn't being processed is that
the shard went "hot" after the repair was queued.
|
||||
The Anti-Entropy service only repairs cold shards or shards that are not currently being written to.
|
||||
If the shard is hot, the Anti-Entropy service will wait until it goes cold again before performing the repair.
|
||||
|
||||
If the shard is "old" and writes to it are part of a backfill process, you simply
|
||||
have to wait until the backfill process is finished. If the shard is the active
|
||||
shard, run `truncate-shards` to stop writes to active shards. This process is
|
||||
outlined [above](#fixing-entropy-in-active-shards).
|
||||
|
||||
### Anti-Entropy log messages
|
||||
|
||||
Below are common messages output by Anti-Entropy along with what they mean.
|
||||
|
||||
#### `Checking status`
|
||||
|
||||
Indicates that the Anti-Entropy process has begun the [status check process](#detecting-entropy).
|
||||
|
||||
#### `Skipped shards`
|
||||
|
||||
Indicates that the Anti-Entropy process has skipped a status check on shards because they are currently [hot](#hot-shards).
|

---
title: InfluxDB Anti-Entropy API
description: >
  Monitor and repair shards on InfluxDB Enterprise data nodes using the InfluxDB Anti-Entropy API.
menu:
  enterprise_influxdb_1_10:
    name: Anti-entropy API
    weight: 70
    parent: Use Anti-entropy service
aliases:
  - /enterprise_influxdb/v1.10/administration/anti-entropy-api/
---
|
||||
|
||||
>**Note:** The Anti-Entropy API is available from the meta nodes and is only available when the Anti-Entropy service is enabled in the data node configuration settings. For information on the configuration settings, see
|
||||
> [Anti-Entropy settings](/enterprise_influxdb/v1.10/administration/config-data-nodes/#anti-entropy-ae-settings).
|
||||
|
||||
Use the [Anti-Entropy service](/enterprise_influxdb/v1.10/administration/anti-entropy) in InfluxDB Enterprise to monitor and repair entropy in data nodes and their shards. To access the Anti-Entropy API and work with this service, use [`influxd-ctl entropy`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#entropy) (also available on meta nodes).
|
||||
|
||||
The base URL is:
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair
|
||||
```
|
||||
|
||||
## GET `/status`
|
||||
|
||||
### Description
|
||||
|
||||
Lists shards that are in an inconsistent state and in need of repair.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `local` | query | Limits status check to local shards on the data node handling this request | No | boolean |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
|-------------|--------------------|
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description | Type |
|
||||
| ---- | ----------- | ------ |
|
||||
| `200` | `Successful operation` | object |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X GET "http://localhost:8086/shard-repair/status?local=true" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/status?local=true
|
||||
|
||||
```
|
||||
|
||||
### Responses
|
||||
|
||||
Example of server response value:
|
||||
|
||||
```json
|
||||
{
|
||||
"shards": [
|
||||
{
|
||||
"id": "1",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "-259200000000000",
|
||||
"end_time": "345600000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
},
|
||||
{
|
||||
"id": "3",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "62640000000000000",
|
||||
"end_time": "63244800000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
}
|
||||
],
|
||||
"queued_shards": [
|
||||
"3",
|
||||
"5",
|
||||
"9"
|
||||
],
|
||||
"processing_shards": [
|
||||
"3",
|
||||
"9"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## POST `/repair`
|
||||
|
||||
### Description
|
||||
|
||||
Queues the specified shard for repair of the inconsistent state.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `id` | query | ID of shard to queue for repair | Yes | integer |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
| ----------- | ----- |
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description |
|
||||
| ---- | ----------- |
|
||||
| `204` | `Successful operation` |
|
||||
| `400` | `Bad request` |
|
||||
| `500` | `Internal server error` |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X POST "http://localhost:8086/shard-repair/repair?id=1" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/repair?id=1
|
||||
```
|
||||
|
||||
## POST `/cancel-repair`
|
||||
|
||||
### Description
|
||||
|
||||
Removes the specified shard from the repair queue on nodes.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `id` | query | ID of shard to remove from repair queue | Yes | integer |
|
||||
| `local` | query | Only remove shard from repair queue on node receiving the request | No | boolean |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
|-------------|--------------------|
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description |
|
||||
| ---- | ----------- |
|
||||
| `204` | `Successful operation` |
|
||||
| `400` | `Bad request` |
|
||||
| `500` | `Internal server error` |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X POST "http://localhost:8086/shard-repair/cancel-repair?id=1&local=false" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/cancel-repair?id=1&local=false
|
||||
```
|
||||
|
||||
## Models
|
||||
|
||||
### ShardStatus
|
||||
|
||||
| Name | Type | Required |
|
||||
| ---- | ---- | -------- |
|
||||
| `id` | string | No |
|
||||
| `database` | string | No |
|
||||
| `retention_policy` | string | No |
|
||||
| `start_time` | string | No |
|
||||
| `end_time` | string | No |
|
||||
| `expires` | string | No |
|
||||
| `status` | string | No |
|
||||
|
||||
### Examples
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"shards": [
|
||||
{
|
||||
"id": "1",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "-259200000000000",
|
||||
"end_time": "345600000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
},
|
||||
{
|
||||
"id": "3",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "62640000000000000",
|
||||
"end_time": "63244800000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
}
|
||||
],
|
||||
"queued_shards": [
|
||||
"3",
|
||||
"5",
|
||||
"9"
|
||||
],
|
||||
"processing_shards": [
|
||||
"3",
|
||||
"9"
|
||||
]
|
||||
}
|
||||
```
|
---
|
||||
title: Configure InfluxDB Enterprise meta nodes
|
||||
description: >
|
||||
Configure InfluxDB Enterprise meta node settings and environment variables.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure meta nodes
|
||||
parent: Configure
|
||||
weight: 30
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/config-meta-nodes/
|
||||
---
|
||||
|
||||
* [Meta node configuration settings](#meta-node-configuration-settings)
|
||||
* [Global options](#global-options)
|
||||
* [Enterprise license `[enterprise]`](#enterprise)
|
||||
* [Meta node `[meta]`](#meta)
|
||||
* [TLS `[tls]`](#tls-settings)
|
||||
|
||||
## Meta node configuration settings
|
||||
|
||||
### Global options
|
||||
|
||||
#### `reporting-disabled`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
InfluxData, the company, relies on reported data from running nodes primarily to
|
||||
track the adoption rates of different InfluxDB versions.
|
||||
These data help InfluxData support the continuing development of InfluxDB.
|
||||
|
||||
The `reporting-disabled` option toggles the reporting of data every 24 hours to
|
||||
`usage.influxdata.com`.
|
||||
Each report includes a randomly-generated identifier, OS, architecture,
|
||||
InfluxDB version, and the number of databases, measurements, and unique series.
|
||||
To disable reporting, set this option to `true`.
|
||||
|
||||
> **Note:** No data from user databases are ever transmitted.
|
||||
|
||||
#### `bind-address`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
This setting is not intended for use.
|
||||
It will be removed in future versions.
|
||||
|
||||
#### `hostname`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
The hostname of the [meta node](/enterprise_influxdb/v1.10/concepts/glossary/#meta-node).
|
||||
This must be resolvable and reachable by all other members of the cluster.
|
||||
|
||||
Environment variable: `INFLUXDB_HOSTNAME`
|
||||
|
||||
-----
|
||||
|
||||
### Enterprise license settings
|
||||
#### `[enterprise]`
|
||||
|
||||
The `[enterprise]` section contains the parameters for the meta node's
|
||||
registration with the [InfluxData portal](https://portal.influxdata.com/).
|
||||
|
||||
#### `license-key`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
The license key created for you on [InfluxData portal](https://portal.influxdata.com).
|
||||
The meta node transmits the license key to
|
||||
[portal.influxdata.com](https://portal.influxdata.com) over port 80 or port 443
|
||||
and receives a temporary JSON license file in return.
|
||||
The server caches the license file locally.
|
||||
If your server cannot communicate with [https://portal.influxdata.com](https://portal.influxdata.com), you must use the [`license-path` setting](#license-path).
|
||||
|
||||
Use the same key for all nodes in the same cluster.
|
||||
{{% warn %}}The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.10/administration/renew-license/).
|
||||
|
||||
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_KEY`
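
For example, here is a minimal sketch of supplying the license key through the environment instead of the configuration file. The environment file path and service name assume a systemd-based package install, and `<your-license-key>` is a placeholder for the key from your InfluxPortal account.

```bash
# Sketch: set the meta node license key via its environment file
# (assumes a systemd-based install; <your-license-key> is a placeholder).
echo 'INFLUXDB_ENTERPRISE_LICENSE_KEY=<your-license-key>' | sudo tee -a /etc/default/influxdb-meta

# Restart the meta node so the updated license setting takes effect.
sudo systemctl restart influxdb-meta
```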
|
||||
|
||||
#### `license-path`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
The local path to the permanent JSON license file that you received from InfluxData
|
||||
for instances that do not have access to the internet.
|
||||
To obtain a license file, contact [sales@influxdb.com](mailto:sales@influxdb.com).
|
||||
|
||||
The license file must be saved on every server in the cluster, including meta nodes
|
||||
and data nodes.
|
||||
The file contains the JSON-formatted license, and must be readable by the `influxdb` user.
|
||||
Each server in the cluster independently verifies its license.
|
||||
|
||||
{{% warn %}}
|
||||
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.10/administration/renew-license/).
|
||||
|
||||
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_PATH`
|
||||
|
||||
-----
|
||||
### Meta node settings
|
||||
|
||||
#### `[meta]`
|
||||
|
||||
#### `dir`
|
||||
|
||||
Default is `"/var/lib/influxdb/meta"`.
|
||||
|
||||
The directory where cluster meta data is stored.
|
||||
|
||||
Environment variable: `INFLUXDB_META_DIR`
|
||||
|
||||
#### `bind-address`
|
||||
|
||||
Default is `":8089"`.
|
||||
|
||||
The bind address (port) for meta node communication.
|
||||
For simplicity, InfluxData recommends using the same port on all meta nodes,
|
||||
but this is not necessary.
|
||||
|
||||
Environment variable: `INFLUXDB_META_BIND_ADDRESS`
|
||||
|
||||
#### `http-bind-address`
|
||||
|
||||
Default is `":8091"`.
|
||||
|
||||
The default address to bind the API to.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTP_BIND_ADDRESS`
|
||||
|
||||
#### `https-enabled`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Determines whether meta nodes use HTTPS to communicate with each other. By default, HTTPS is disabled. We strongly recommend enabling HTTPS.
|
||||
|
||||
To enable HTTPS, set `https-enabled` to `true`, specify the path to the SSL certificate with `https-certificate = ""`, and specify the path to the SSL private key with `https-private-key = ""`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_ENABLED`
|
||||
|
||||
#### `https-certificate`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
If HTTPS is enabled, specify the path to the SSL certificate.
|
||||
Use either:
|
||||
|
||||
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
|
||||
* Certificate only (`[certificate].crt`)
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_CERTIFICATE`
|
||||
|
||||
#### `https-private-key`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
If HTTPS is enabled, specify the path to the SSL private key.
|
||||
Use either:
|
||||
|
||||
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
|
||||
* Private key only (`[private-key].key`)
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_PRIVATE_KEY`
|
||||
|
||||
#### `https-insecure-tls`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Whether meta nodes will skip certificate validation communicating with each other over HTTPS.
|
||||
This is useful when testing with self-signed certificates.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_INSECURE_TLS`
|
||||
|
||||
#### `data-use-tls`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Whether to use TLS to communicate with data nodes.
|
||||
|
||||
#### `data-insecure-tls`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Whether meta nodes will skip certificate validation communicating with data nodes over TLS.
|
||||
This is useful when testing with self-signed certificates.
|
||||
|
||||
#### `gossip-frequency`
|
||||
|
||||
Default is `"5s"`.
|
||||
|
||||
The default frequency with which the node will gossip its known announcements.
|
||||
|
||||
#### `announcement-expiration`
|
||||
|
||||
Default is `"30s"`.
|
||||
|
||||
The default length of time an announcement is kept before it is considered too old.
|
||||
|
||||
#### `retention-autocreate`
|
||||
|
||||
Default is `true`.
|
||||
|
||||
Automatically create a default retention policy when creating a database.
|
||||
|
||||
#### `election-timeout`
|
||||
|
||||
Default is `"1s"`.
|
||||
|
||||
The amount of time in candidate state without a leader before we attempt an election.
|
||||
|
||||
#### `heartbeat-timeout`
|
||||
|
||||
Default is `"1s"`.
|
||||
|
||||
The amount of time in follower state without a leader before we attempt an election.
|
||||
|
||||
#### `leader-lease-timeout`
|
||||
|
||||
Default is `"500ms"`.
|
||||
|
||||
The leader lease timeout is the amount of time a Raft leader will remain leader
|
||||
if it does not hear from a majority of nodes.
|
||||
After the timeout the leader steps down to the follower state.
|
||||
Clusters with high latency between nodes may want to increase this parameter to
|
||||
avoid unnecessary Raft elections.
|
||||
|
||||
Environment variable: `INFLUXDB_META_LEADER_LEASE_TIMEOUT`
|
||||
|
||||
#### `commit-timeout`
|
||||
|
||||
Default is `"50ms"`.
|
||||
|
||||
The commit timeout is the amount of time a Raft node will tolerate between
|
||||
commands before issuing a heartbeat to tell the leader it is alive.
|
||||
The default setting should work for most systems.
|
||||
|
||||
Environment variable: `INFLUXDB_META_COMMIT_TIMEOUT`
|
||||
|
||||
#### `consensus-timeout`
|
||||
|
||||
Default is `"30s"`.
|
||||
|
||||
Timeout waiting for consensus before getting the latest Raft snapshot.
|
||||
|
||||
Environment variable: `INFLUXDB_META_CONSENSUS_TIMEOUT`
|
||||
|
||||
#### `cluster-tracing`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Log all HTTP requests made to meta nodes.
|
||||
Prints sanitized POST request information to show actual commands.
|
||||
|
||||
**Sample log output:**
|
||||
|
||||
```
|
||||
ts=2021-12-08T02:00:54.864731Z lvl=info msg=weblog log_id=0YHxBFZG001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"fipple\",\"password\":[REDACTED]}}': ''}" status=307 size=0 referrer= user-agent=curl/7.68.0 request-id=ad87ce47-57ca-11ec-8026-0242ac120004 execution-time=63.571ms execution-time-readable=63.570738ms
|
||||
ts=2021-12-08T02:01:00.070137Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"fipple\",\"password\":[REDACTED]}}': ''}" status=200 size=0 referrer= user-agent=curl/7.68.0 request-id=b09eb13a-57ca-11ec-800d-0242ac120003 execution-time=85.823ms execution-time-readable=85.823406ms
|
||||
ts=2021-12-08T02:01:29.062313Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"gremch\",\"hash\":[REDACTED]}}': ''}" status=200 size=0 referrer= user-agent=curl/7.68.0 request-id=c1f3614a-57ca-11ec-8015-0242ac120003 execution-time=1.722ms execution-time-readable=1.722089ms
|
||||
ts=2021-12-08T02:01:47.457607Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"gremchy\",\"hash\":[REDACTED]}}': ''}" status=400 size=37 referrer= user-agent=curl/7.68.0 request-id=ccea84b7-57ca-11ec-8019-0242ac120003 execution-time=0.154ms execution-time-readable=154.417µs
|
||||
ts=2021-12-08T02:02:05.522571Z lvl=info msg=weblog log_id=0YHxBEhl001 service=meta-http host=172.18.0.1 user-id= username=admin method=POST uri=/user protocol=HTTP/1.1 command="{'{\"action\":\"create\",\"user\":{\"name\":\"thimble\",\"password\":[REDACTED]}}': ''}" status=400 size=37 referrer= user-agent=curl/7.68.0 request-id=d7af0082-57ca-11ec-801f-0242ac120003 execution-time=0.227ms execution-time-readable=227.853µs
|
||||
```
|
||||
|
||||
Environment variable: `INFLUXDB_META_CLUSTER_TRACING`
|
||||
|
||||
#### `logging-enabled`
|
||||
|
||||
Default is `true`.
|
||||
|
||||
Meta logging toggles the logging of messages from the meta service.
|
||||
|
||||
Environment variable: `INFLUXDB_META_LOGGING_ENABLED`
|
||||
|
||||
#### `pprof-enabled`
|
||||
|
||||
Default is `true`.
|
||||
|
||||
Enables the `/debug/pprof` endpoint for troubleshooting.
|
||||
To disable, set the value to `false`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_PPROF_ENABLED`
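
For example, a minimal sketch of pulling a profile from a meta node while troubleshooting. This assumes the standard Go pprof handlers are served on the meta node's HTTP API port (default `8091`); adjust the host and port for your deployment.

```bash
# Sketch: fetch a goroutine profile from a meta node's /debug/pprof endpoint.
# Assumes the meta node HTTP API is listening on the default port 8091.
curl "http://localhost:8091/debug/pprof/goroutine?debug=1"
```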
|
||||
|
||||
#### `lease-duration`
|
||||
|
||||
Default is `"1m0s"`.
|
||||
|
||||
The default duration of the leases that data nodes acquire from the meta nodes.
|
||||
Leases automatically expire after the `lease-duration` is met.
|
||||
|
||||
Leases ensure that only one data node is running something at a given time.
|
||||
For example, [continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq)
|
||||
(CQs) use a lease so that all data nodes aren't running the same CQs at once.
|
||||
|
||||
For more details about `lease-duration` and its impact on continuous queries, see
|
||||
[Configuration and operational considerations on a cluster](/enterprise_influxdb/v1.10/features/clustering-features/#configuration-and-operational-considerations-on-a-cluster).
|
||||
|
||||
Environment variable: `INFLUXDB_META_LEASE_DURATION`
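
For example, a minimal sketch of overriding the lease duration through the environment before starting the meta node (the `2m0s` value is only an illustration):

```bash
# Sketch: lengthen the lease that data nodes acquire from meta nodes.
export INFLUXDB_META_LEASE_DURATION="2m0s"
```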
|
||||
|
||||
#### `auth-enabled`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
If true, HTTP endpoints require authentication.
|
||||
This setting must have the same value as the data nodes' `meta.meta-auth-enabled` configuration.
|
||||
|
||||
#### `ldap-allowed`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
Whether LDAP is allowed to be set.
|
||||
If true, you will need to use `influxd ldap set-config` and set enabled=true to use LDAP authentication.
|
||||
|
||||
#### `shared-secret`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
The shared secret to be used by the public API for creating custom JWT authentication.
|
||||
If you use this setting, set [`auth-enabled`](#auth-enabled) to `true`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_SHARED_SECRET`
|
||||
|
||||
#### `internal-shared-secret`
|
||||
|
||||
Default is `""`.
|
||||
|
||||
The shared secret used by the internal API for JWT authentication for
|
||||
inter-node communication within the cluster.
|
||||
Set this to a long pass phrase.
|
||||
This value must be the same value as the
|
||||
[`[meta] meta-internal-shared-secret`](/enterprise_influxdb/v1.10/administration/config-data-nodes#meta-internal-shared-secret) in the data node configuration file.
|
||||
To use this option, set [`auth-enabled`](#auth-enabled) to `true`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_INTERNAL_SHARED_SECRET`
|
||||
|
||||
#### `password-hash`
|
||||
|
||||
Default is `"bcrypt"`.
|
||||
|
||||
Specifies the password hashing scheme and its configuration.
|
||||
|
||||
FIPS-readiness is achieved by specifying an appropriate password hashing scheme, such as `pbkdf2-sha256` or `pbkdf2-sha512`.
|
||||
The configured password hashing scheme and its FIPS readiness are logged on startup of `influxd` and `influxd-meta` for verification and auditing purposes.
|
||||
|
||||
The configuration is a semicolon delimited list.
|
||||
The first section specifies the password hashing scheme.
|
||||
Optional sections after this are `key=value` password hash configuration options.
|
||||
Each scheme has its own set of options.
|
||||
Any options not specified default to reasonable values as specified below.
|
||||
|
||||
This setting must have the same value as the data node option [`meta.password-hash`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#password-hash).
|
||||
|
||||
Environment variable: `INFLUXDB_META_PASSWORD_HASH`
|
||||
|
||||
**Example hashing configurations:**
|
||||
|
||||
| String | Description | FIPS ready |
|
||||
|:-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|------------|
|
||||
| `bcrypt` | Specifies the [`bcrypt`](#bcrypt) hashing scheme with default options. | No |
|
||||
| `pbkdf2-sha256;salt_len=32;rounds=64000` | Specifies the [`pbkdf2-sha256`](#pbkdf2-sha256) hashing scheme with options `salt_len` set to `32` and `rounds` set to `64000`. | Yes |
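
For example, a minimal sketch of applying the FIPS-ready configuration from the table above through the environment variable documented here. The environment file path assumes a systemd-based install, and the same hashing value must also be configured on every data node.

```bash
# Sketch: set a FIPS-ready password hashing scheme on a meta node.
echo 'INFLUXDB_META_PASSWORD_HASH=pbkdf2-sha256;salt_len=32;rounds=64000' | sudo tee -a /etc/default/influxdb-meta
sudo systemctl restart influxdb-meta
```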
|
||||
|
||||
Supported password hashing schemes and options:
|
||||
|
||||
##### `bcrypt`
|
||||
|
||||
`bcrypt` is the default hashing scheme.
|
||||
It is not a FIPS-ready password hashing scheme.
|
||||
|
||||
**Options:**
|
||||
|
||||
* `cost`
|
||||
* Specifies the cost of hashing.
|
||||
Number of rounds performed is `2^cost`.
|
||||
Higher cost gives greater security at the expense of execution time.
|
||||
* Default value: `10`
|
||||
* Valid range: [`4`, `31`]
|
||||
|
||||
##### `pbkdf2-sha256`
|
||||
|
||||
`pbkdf2-sha256` uses the PBKDF2 scheme with SHA-256 as the HMAC function.
|
||||
It is FIPS-ready according to [NIST Special Publication 800-132] §5.3
|
||||
when used with appropriate `rounds` and `salt_len` options.
|
||||
|
||||
**Options:**
|
||||
|
||||
* `rounds`
|
||||
* Specifies the number of rounds to perform.
|
||||
Higher cost gives greater security at the expense of execution time.
|
||||
* Default value: `29000`
|
||||
* Valid range: [`1`, `4294967295`]
|
||||
* Must be greater than or equal to `1000`
|
||||
for FIPS-readiness according to [NIST Special Publication 800-132] §5.2.
|
||||
* `salt_len`
|
||||
* Specifies the salt length in bytes.
|
||||
The longer the salt, the more difficult it is for an attacker to generate a table of password hashes.
|
||||
* Default value: `16`
|
||||
* Valid range: [`1`, `1024`]
|
||||
* Must be greater than or equal to `16`
|
||||
for FIPS-readiness according to [NIST Special Publication 800-132] §5.1.
|
||||
|
||||
##### `pbkdf2-sha512`
|
||||
|
||||
`pbkdf2-sha512` uses the PBKDF2 scheme with SHA-512 as the HMAC function.
|
||||
It is FIPS-ready according to [NIST Special Publication 800-132] §5.3
|
||||
when used with appropriate `rounds` and `salt_len` options.
|
||||
|
||||
**Options:**
|
||||
|
||||
* `rounds`
|
||||
* Specifies the number of rounds to perform.
|
||||
Higher cost gives greater security at the expense of execution time.
|
||||
* Default value: `29000`
|
||||
* Valid range: [`1`, `4294967295`]
|
||||
* Must be greater than or equal to `1000`
|
||||
for FIPS-readiness according to [NIST Special Publication 800-132] § 5.2.
|
||||
* `salt_len`
|
||||
* Specifies the salt length in bytes.
|
||||
The longer the salt, the more difficult it is for an attacker to generate a table of password hashes.
|
||||
* Default value: `16`
|
||||
* Valid range: [`1`, `1024`]
|
||||
* Must be greater than or equal to `16`
|
||||
for FIPS-readiness according to [NIST Special Publication 800-132] § 5.1.
|
||||
|
||||
#### `ensure-fips`
|
||||
|
||||
Default is `false`.
|
||||
|
||||
If `ensure-fips` is set to `true`, then `influxd` and `influxd-meta`
|
||||
will refuse to start if they are not configured in a FIPS-ready manner.
|
||||
For example, `password-hash = "bcrypt"` would not be allowed if `ensure-fips = true`.
|
||||
`ensure-fips` gives the administrator extra confidence that their instances are configured in a FIPS-ready manner.
|
||||
|
||||
Environment variable: `INFLUXDB_META_ENSURE_FIPS`
|
||||
|
||||
[NIST Special Publication 800-132]: https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-132.pdf
|
||||
|
||||
### TLS settings
|
||||
|
||||
For more information, see [TLS settings for data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes#tls-settings).
|
||||
|
||||
#### Recommended "modern compatibility" cipher settings
|
||||
|
||||
```toml
|
||||
ciphers = [ "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
|
||||
"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
|
||||
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
|
||||
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
|
||||
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
|
||||
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
|
||||
]
|
||||
|
||||
min-version = "tls1.3"
|
||||
|
||||
max-version = "tls1.3"
|
||||
|
||||
```
|
|
---
|
||||
title: Configure InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Learn about global options, meta node options, data node options, and other InfluxDB Enterprise configuration settings.
|
||||
aliases:
|
||||
- /enterprise/v1.10/administration/configuration/
|
||||
- /enterprise_influxdb/v1.10/administration/configuration/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure clusters
|
||||
parent: Configure
|
||||
weight: 10
|
||||
---
|
||||
|
||||
This page contains general information about configuring InfluxDB Enterprise clusters.
|
||||
For complete listings and descriptions of the configuration settings, see:
|
||||
|
||||
* [Configure data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes)
|
||||
* [Configure meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes)
|
||||
|
||||
## Use configuration files
|
||||
|
||||
### Display the default configurations
|
||||
|
||||
The following commands print out a TOML-formatted configuration with all
|
||||
available options set to their default values.
|
||||
|
||||
#### Meta node configuration
|
||||
|
||||
```bash
|
||||
influxd-meta config
|
||||
```
|
||||
|
||||
#### Data node configuration
|
||||
|
||||
```bash
|
||||
influxd config
|
||||
```
|
||||
|
||||
#### Create a configuration file
|
||||
|
||||
On POSIX systems, generate a new configuration file by redirecting the output
|
||||
of the command to a file.
|
||||
|
||||
New meta node configuration file:
|
||||
```
|
||||
influxd-meta config > /etc/influxdb/influxdb-meta-generated.conf
|
||||
```
|
||||
|
||||
New data node configuration file:
|
||||
```
|
||||
influxd config > /etc/influxdb/influxdb-generated.conf
|
||||
```
|
||||
|
||||
Preserve custom settings from older configuration files when generating a new
|
||||
configuration file with the `-config` option.
|
||||
For example, this overwrites any default configuration settings in the output
|
||||
file (`/etc/influxdb/influxdb.conf.new`) with the configuration settings from
|
||||
the file (`/etc/influxdb/influxdb.conf.old`) passed to `-config`:
|
||||
|
||||
```
|
||||
influxd config -config /etc/influxdb/influxdb.conf.old > /etc/influxdb/influxdb.conf.new
|
||||
```
|
||||
|
||||
#### Launch the process with a configuration file
|
||||
|
||||
There are two ways to launch the meta or data processes using your customized
|
||||
configuration file.
|
||||
|
||||
* Point the process to the desired configuration file with the `-config` option.
|
||||
|
||||
To start the meta node process with `/etc/influxdb/influxdb-meta-generate.conf`:
|
||||
|
||||
influxd-meta -config /etc/influxdb/influxdb-meta-generate.conf
|
||||
|
||||
To start the data node process with `/etc/influxdb/influxdb-generated.conf`:
|
||||
|
||||
influxd -config /etc/influxdb/influxdb-generated.conf
|
||||
|
||||
|
||||
* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your
|
||||
configuration file and start the process.
|
||||
|
||||
To set the `INFLUXDB_CONFIG_PATH` environment variable and launch the data
|
||||
process using `INFLUXDB_CONFIG_PATH` for the configuration file path:
|
||||
|
||||
export INFLUXDB_CONFIG_PATH=/root/influxdb.generated.conf
|
||||
echo $INFLUXDB_CONFIG_PATH
|
||||
/root/influxdb.generated.conf
|
||||
influxd
|
||||
|
||||
If set, the command line `-config` path overrides any environment variable path.
|
||||
If you do not supply a configuration file, InfluxDB uses an internal default
|
||||
configuration (equivalent to the output of `influxd config` and `influxd-meta
|
||||
config`).
|
||||
|
||||
{{% warn %}} Note: As of 1.3, if no configuration file is specified, the `influxd-meta` binary checks the `INFLUXDB_META_CONFIG_PATH` environment variable.
If that environment variable is set, its value is used as the configuration file path.
If unset, the binary checks the `~/.influxdb` and `/etc/influxdb` directories for an `influxdb-meta.conf` file and automatically loads the first one it finds.
<br>
This matches the behavior that the open source and data node versions of InfluxDB already follow.
|
||||
{{% /warn %}}
|
||||
|
||||
Configure InfluxDB using the configuration file (`influxdb.conf`) and environment variables.
|
||||
The default value for each configuration setting is shown in the documentation.
|
||||
Commented configuration options use the default value.
|
||||
|
||||
Configuration settings with a duration value support the following duration units:
|
||||
|
||||
- `ns` _(nanoseconds)_
|
||||
- `us` or `µs` _(microseconds)_
|
||||
- `ms` _(milliseconds)_
|
||||
- `s` _(seconds)_
|
||||
- `m` _(minutes)_
|
||||
- `h` _(hours)_
|
||||
- `d` _(days)_
|
||||
- `w` _(weeks)_
|
||||
|
||||
### Environment variables
|
||||
|
||||
All configuration options can be specified in the configuration file or in
|
||||
environment variables.
|
||||
Environment variables override the equivalent options in the configuration
|
||||
file.
|
||||
If a configuration option is not specified in either the configuration file
|
||||
or in an environment variable, InfluxDB uses its internal default
|
||||
configuration.
|
||||
|
||||
In the sections below we name the relevant environment variable in the
|
||||
description for the configuration setting.
|
||||
Environment variables can be set in `/etc/default/influxdb-meta` and
|
||||
`/etc/default/influxdb`.
|
||||
|
||||
> **Note:**
|
||||
To set or override settings in a config section that allows multiple
|
||||
configurations (any section with double brackets (`[[...]]`) in the header supports
|
||||
multiple configurations), the desired configuration must be specified by ordinal
|
||||
number.
|
||||
For example, for the first set of `[[graphite]]` environment variables,
|
||||
prefix the configuration setting name in the environment variable with the
|
||||
relevant position number (in this case: `0`):
|
||||
>
|
||||
INFLUXDB_GRAPHITE_0_BATCH_PENDING
|
||||
INFLUXDB_GRAPHITE_0_BATCH_SIZE
|
||||
INFLUXDB_GRAPHITE_0_BATCH_TIMEOUT
|
||||
INFLUXDB_GRAPHITE_0_BIND_ADDRESS
|
||||
INFLUXDB_GRAPHITE_0_CONSISTENCY_LEVEL
|
||||
INFLUXDB_GRAPHITE_0_DATABASE
|
||||
INFLUXDB_GRAPHITE_0_ENABLED
|
||||
INFLUXDB_GRAPHITE_0_PROTOCOL
|
||||
INFLUXDB_GRAPHITE_0_RETENTION_POLICY
|
||||
INFLUXDB_GRAPHITE_0_SEPARATOR
|
||||
INFLUXDB_GRAPHITE_0_TAGS
|
||||
INFLUXDB_GRAPHITE_0_TEMPLATES
|
||||
INFLUXDB_GRAPHITE_0_UDP_READ_BUFFER
|
||||
>
|
||||
For the Nth Graphite configuration in the configuration file, the relevant
|
||||
environment variables would be of the form `INFLUXDB_GRAPHITE_(N-1)_BATCH_PENDING`.
|
||||
For each section of the configuration file the numbering restarts at zero.
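
For example, a minimal sketch of overriding the first and second `[[graphite]]` sections through environment variables (the bind addresses are illustrative only):

```bash
# Sketch: override settings for the first ([[graphite]] index 0) and
# second ([[graphite]] index 1) Graphite listeners.
export INFLUXDB_GRAPHITE_0_ENABLED=true
export INFLUXDB_GRAPHITE_0_BIND_ADDRESS=":2003"
export INFLUXDB_GRAPHITE_1_ENABLED=true
export INFLUXDB_GRAPHITE_1_BIND_ADDRESS=":2004"
```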
|
||||
|
||||
### `GOMAXPROCS` environment variable
|
||||
|
||||
{{% note %}}
|
||||
_**Note:**_ `GOMAXPROCS` cannot be set using the InfluxDB configuration file.
|
||||
It can only be set as an environment variable.
|
||||
{{% /note %}}
|
||||
|
||||
The `GOMAXPROCS` [Go language environment variable](https://golang.org/pkg/runtime/#hdr-Environment_Variables)
|
||||
can be used to set the maximum number of CPUs that can execute simultaneously.
|
||||
|
||||
The default value of `GOMAXPROCS` is the number of CPUs
|
||||
that are visible to the program *on startup*
|
||||
(based on what the operating system considers to be a CPU).
|
||||
For a 32-core machine, the `GOMAXPROCS` value would be `32`.
|
||||
You can override this value to be less than the maximum value,
|
||||
which can be useful in cases where you are running InfluxDB
|
||||
along with other processes on the same machine
|
||||
and want to ensure that the database doesn't negatively affect those processes.
|
||||
|
||||
{{% note %}}
|
||||
_**Note:**_ Setting `GOMAXPROCS=1` eliminates all parallelization.
|
||||
{{% /note %}}
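
For example, a minimal sketch of capping InfluxDB at eight CPUs when launching a data node manually (the value `8` and the configuration path are illustrative):

```bash
# Sketch: limit the number of CPUs InfluxDB may use, then start the data node.
export GOMAXPROCS=8
influxd -config /etc/influxdb/influxdb.conf
```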
|
|
---
|
||||
title: Configure TCP and UDP ports used in InfluxDB Enterprise
|
||||
description: Configure TCP and UDP ports in InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure TCP and UDP Ports
|
||||
parent: Configure
|
||||
weight: 60
|
||||
aliases:
|
||||
- /enterprise/v1.10/administration/ports/
|
||||
- /enterprise_influxdb/v1.10/administration/ports/
|
||||
---
|
||||
|
||||
![InfluxDB Enterprise network diagram](/img/enterprise/1-8-network-diagram.png)
|
||||
|
||||
## Enabled ports
|
||||
|
||||
### 8086
|
||||
|
||||
The default port that runs the InfluxDB HTTP service.
|
||||
It is used for the primary public write and query API.
|
||||
Clients include the CLI, Chronograf, InfluxDB client libraries, Grafana, curl, or anything that wants to write and read time series data to and from InfluxDB.
|
||||
[Configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#bind-address)
|
||||
in the data node configuration file.
|
||||
|
||||
_See also: [API Reference](/enterprise_influxdb/v1.10/tools/api/)._
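
For example, a quick way to confirm that the HTTP service is reachable on this port is the `/ping` endpoint, which returns `204 No Content` from a healthy node:

```bash
# Check that the InfluxDB HTTP API is listening on port 8086.
curl -i "http://localhost:8086/ping"
```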
|
||||
|
||||
### 8088
|
||||
|
||||
Data nodes listen on this port.
|
||||
Primarily used by other data nodes to handle distributed reads and writes at runtime.
|
||||
Used to control a data node (e.g., tell it to write to a specific shard or execute a query).
|
||||
It's also used by meta nodes for cluster-type operations (e.g., tell a data node to join or leave the cluster).
|
||||
|
||||
This is the default port used for RPC calls used for inter-node communication and by the CLI for backup and restore operations
|
||||
(`influxd backup` and `influxd restore`).
|
||||
[Configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#bind-address)
|
||||
in the configuration file.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
_See also: [Back up and restore](/enterprise_influxdb/v1.10/administration/backup-and-restore/)._
|
||||
|
||||
### 8089
|
||||
|
||||
Used for communication between meta nodes.
|
||||
It is used by the Raft consensus protocol.
|
||||
The only clients using `8089` should be the other meta nodes in the cluster.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
### 8091
|
||||
|
||||
Meta nodes listen on this port.
|
||||
It is used for the meta service API.
|
||||
Primarily used by data nodes to stay in sync about databases, retention policies, shards, users, privileges, etc.
|
||||
Used by meta nodes to receive incoming connections from data nodes and Chronograf.
Clients also include the `influxd-ctl` command line tool.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
## Disabled ports
|
||||
|
||||
### 2003
|
||||
|
||||
The default port that runs the Graphite service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/config-data-nodes/#bind-address-2003)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [Graphite README](https://github.com/influxdata/influxdb/tree/1.8/services/graphite/README.md)
|
||||
|
||||
### 4242
|
||||
|
||||
The default port that runs the OpenTSDB service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#opentsdb-settings)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [OpenTSDB README](https://github.com/influxdata/influxdb/tree/1.8/services/opentsdb/README.md)
|
||||
|
||||
### 8089
|
||||
|
||||
The default port that runs the UDP service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#udp-settings)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [UDP README](https://github.com/influxdata/influxdb/tree/1.8/services/udp/README.md)
|
||||
|
||||
### 25826
|
||||
|
||||
The default port that runs the Collectd service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#collectd-settings)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [Collectd README](https://github.com/influxdata/influxdb/tree/1.8/services/collectd/README.md)
|
|
---
|
||||
title: Configure security
|
||||
description: Configure security features in InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure security
|
||||
parent: Configure
|
||||
weight: 40
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/security/
|
||||
---
|
||||
|
||||
_For user and permission management (authorization),
|
||||
see [Manage users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/)._
|
||||
|
||||
{{< children >}}
|
|
---
|
||||
title: Configure authentication
|
||||
description: >
|
||||
Enable authentication to require credentials for a cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Configure security
|
||||
name: Configure authentication
|
||||
weight: 10
|
||||
---
|
||||
|
||||
To configure authentication, do one of the following:
|
||||
|
||||
- [Enable authentication](#enable-authentication)
|
||||
- [Configure authentication using JWT tokens](#configure-authentication-using-jwt-tokens) ([InfluxDB HTTP API](/enterprise_influxdb/v1.10/tools/api/) only)
|
||||
|
||||
## Enable authentication
|
||||
|
||||
Authentication is disabled by default in InfluxDB and InfluxDB Enterprise.
|
||||
After [installing the data nodes](/enterprise_influxdb/v1.10/introduction/install-and-deploy/installation/data_node_installation/),
|
||||
enable authentication to control access to your cluster.
|
||||
|
||||
To enable authentication in a cluster, do the following:
|
||||
|
||||
1. Set `auth-enabled` to `true` in the `[http]` section of the configuration files
|
||||
for all meta **and** data nodes:
|
||||
```toml
|
||||
[http]
|
||||
# ...
|
||||
auth-enabled = true
|
||||
```
|
||||
1. Next, create an admin user (if you haven't already).
|
||||
Using the [`influx` CLI](/enterprise_influxdb/v1.10/tools/influx-cli/),
|
||||
run the following command:
|
||||
```
|
||||
CREATE USER admin WITH PASSWORD 'mypassword' WITH ALL PRIVILEGES
|
||||
```
|
||||
1. Restart InfluxDB Enterprise.
|
||||
Once restarted, InfluxDB Enterprise checks user credentials on every request
|
||||
and only processes requests with valid credentials, as shown in the example request after this list.
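
The following is a minimal sketch of querying the cluster with the admin credentials created above (`mypassword` is the placeholder password from step 2):

```bash
# Query with the admin credentials created after enabling authentication.
curl -G "http://localhost:8086/query" \
  -u admin:mypassword \
  --data-urlencode "q=SHOW DATABASES"
```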
|
||||
|
||||
## Configure authentication using JWT tokens
|
||||
|
||||
For a more secure alternative to using passwords, include JWT tokens in requests to the InfluxDB API.
|
||||
|
||||
1. **Add a shared secret in your InfluxDB Enterprise configuration file**.
|
||||
|
||||
InfluxDB Enterprise uses the shared secret to encode the JWT signature.
|
||||
By default, `shared-secret` is set to an empty string (no JWT authentication).
|
||||
Add a custom shared secret in your [InfluxDB configuration file](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#shared-secret)
|
||||
for each meta and data node.
|
||||
Longer strings are more secure:
|
||||
|
||||
```toml
|
||||
[http]
|
||||
shared-secret = "my super secret pass phrase"
|
||||
```
|
||||
|
||||
Alternatively, to avoid keeping your secret phrase as plain text in your InfluxDB configuration file,
|
||||
set the value with the `INFLUXDB_HTTP_SHARED_SECRET` environment variable (for example, in Linux: `export INFLUXDB_HTTP_SHARED_SECRET=MYSUPERSECRETPASSPHRASE`).
|
||||
|
||||
2. **Generate your JWT token**.
|
||||
|
||||
Use an authentication service (such as, [https://jwt.io/](https://jwt.io/))
|
||||
to generate a secure token using your InfluxDB username, an expiration time, and your shared secret.
|
||||
|
||||
The payload (or claims) of the token must be in the following format:
|
||||
|
||||
```json
|
||||
{
|
||||
"username": "myUserName",
|
||||
"exp": 1516239022
|
||||
}
|
||||
```
|
||||
|
||||
- **username** - InfluxDB username.
|
||||
- **exp** - Token expiration in UNIX [epoch time](/enterprise_influxdb/v1.10/query_language/explore-data/#epoch_time).
|
||||
For increased security, keep token expiration periods short.
|
||||
For testing, you can manually generate UNIX timestamps using [https://www.unixtimestamp.com/index.php](https://www.unixtimestamp.com/index.php).
|
||||
|
||||
To encode the payload using your shared secret, use a JWT library in your own authentication server or encode by hand at [https://jwt.io/](https://jwt.io/).
|
||||
|
||||
3. **Include the token in HTTP requests**.
|
||||
|
||||
Include your generated token as part of the `Authorization` header in HTTP requests:
|
||||
|
||||
```
|
||||
Authorization: Bearer <myToken>
|
||||
```
|
||||
{{% note %}}
|
||||
Only unexpired tokens will successfully authenticate.
|
||||
Verify your token has not expired.
|
||||
{{% /note %}}
|
||||
|
||||
#### Example query request with JWT authentication
|
||||
```bash
|
||||
curl -G "http://localhost:8086/query?db=demodb" \
|
||||
--data-urlencode "q=SHOW DATABASES" \
|
||||
--header "Authorization: Bearer <header>.<payload>.<signature>"
|
||||
```
|
||||
|
||||
## Authentication and authorization HTTP errors
|
||||
|
||||
Requests with no authentication credentials or incorrect credentials yield the `HTTP 401 Unauthorized` response.
|
||||
|
||||
Requests by unauthorized users yield the `HTTP 403 Forbidden` response.
|
||||
|
||||
## Next steps
|
||||
|
||||
After configuring authentication,
|
||||
you can [manage users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/)
|
||||
as necessary.
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
|
---
|
||||
title: Configure password hashing
|
||||
description: >
|
||||
Configure the cryptographic algorithm used for password hashing.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure password hashing
|
||||
parent: Configure security
|
||||
weight: 40
|
||||
related:
|
||||
- /enterprise_influxdb/v1.10/administration/configuration/
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/configure-password-hashing/
|
||||
- /enterprise_influxdb/v1.10/administration/manage/configure-password-hashing/
|
||||
---
|
||||
|
||||
By default, InfluxDB Enterprise uses `bcrypt` for password hashing.
|
||||
[FIPS] compliance requires particular hashing algorithms.
|
||||
Use `pbkdf2-sha256` or `pbkdf2-sha512` for FIPS compliance.
|
||||
|
||||
## Change password hashing algorithm
|
||||
|
||||
Complete the following steps
|
||||
to change the password hashing algorithm used by an existing InfluxDB Enterprise cluster:
|
||||
|
||||
1. Ensure all meta and data nodes are running InfluxDB Enterprise 1.10.3 or later.
|
||||
2. In your meta node and data node configuration files, set [`password-hash`] to one of the following:
|
||||
`pbkdf2-sha256` or `pbkdf2-sha512`.
|
||||
Also set [`ensure-fips`] to `true`.
|
||||
|
||||
{{% note %}}
|
||||
The `meta.password-hash` setting must be the same in both the data and meta node configuration files.
|
||||
{{% /note %}}
|
||||
3. Restart each meta and data node to load the configuration change.
|
||||
4. To apply the new hashing algorithm, you must [reset](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/#reset-a-users-password)
|
||||
all existing passwords in the cluster.
|
||||
Otherwise, the previous algorithm will continue to be used.
|
||||
|
||||
## Example configuration
|
||||
|
||||
**Example data node configuration:**
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
# Configures password hashing scheme. Use "pbkdf2-sha256" or "pbkdf2-sha512"
|
||||
# for a FIPS-ready password hash. This setting must have the same value as
|
||||
# the meta nodes' meta.password-hash configuration.
|
||||
password-hash = "pbkdf2-sha256"
|
||||
|
||||
# Configures strict FIPS-readiness check on startup.
|
||||
ensure-fips = true
|
||||
```
|
||||
|
||||
**Example meta node configuration:**
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
# Configures password hashing scheme. Use "pbkdf2-sha256" or "pbkdf2-sha512"
|
||||
# for a FIPS-ready password hash. This setting must have the same value as
|
||||
# the data nodes' meta.password-hash configuration.
|
||||
password-hash = "pbkdf2-sha256"
|
||||
|
||||
# Configures strict FIPS-readiness check on startup.
|
||||
ensure-fips = true
|
||||
```
|
||||
|
||||
## Using FIPS readiness checks
|
||||
|
||||
InfluxDB Enterprise outputs information about the current password hashing configuration at startup.
|
||||
For example:
|
||||
|
||||
```
|
||||
2021-07-21T17:20:44.024846Z info Password hashing configuration: pbkdf2-sha256;rounds=29000;salt_len=16 {"log_id": "0VUXBWE0001"}
|
||||
2021-07-21T17:20:44.024857Z info Password hashing is FIPS-ready: true {"log_id": "0VUXBWE0001"}
|
||||
```
|
||||
|
||||
When `ensure-fips` is enabled, attempting to use `password-hash = bcrypt`
|
||||
will cause the FIPS check to fail.
|
||||
The node then exits with an error in the logs:
|
||||
|
||||
```
|
||||
run: create server: passwordhash: not FIPS-ready: config: 'bcrypt'
|
||||
```
|
||||
|
||||
[FIPS]: https://csrc.nist.gov/publications/detail/fips/140/3/final
|
||||
[`password-hash`]: /enterprise_influxdb/v1.10/administration/config-meta-nodes/#password-hash
|
||||
[`ensure-fips`]: /enterprise_influxdb/v1.10/administration/config-meta-nodes/#ensure-fips
|
|
---
|
||||
title: Configure HTTPS over TLS
|
||||
description: >
|
||||
Enabling HTTPS over TLS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure TLS for cluster
|
||||
parent: Configure security
|
||||
weight: 20
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/https_setup/
|
||||
- /enterprise_influxdb/v1.10/guides/enable_tls/
|
||||
- /enterprise_influxdb/v1.10/guides/enable-tls/
|
||||
---
|
||||
|
||||
Enabling HTTPS over TLS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
|
||||
When configured with a signed certificate, HTTPS over TLS can also verify the authenticity of the InfluxDB Enterprise server to connecting clients.
|
||||
|
||||
This page outlines how to set up HTTPS with InfluxDB Enterprise using either a signed or self-signed certificate.
|
||||
|
||||
{{% warn %}}
|
||||
InfluxData **strongly recommends** enabling HTTPS, especially if you plan on sending requests to InfluxDB Enterprise over a network.
|
||||
{{% /warn %}}
|
||||
|
||||
{{% note %}}
|
||||
These steps have been tested on Debian-based Linux distributions.
|
||||
Specific steps may vary on other operating systems.
|
||||
{{% /note %}}
|
||||
|
||||
## Requirements
|
||||
|
||||
To enable HTTPS with InfluxDB Enterprise, you need a Transport Layer Security (TLS) certificate, also known as a Secure Sockets Layer (SSL) certificate.
|
||||
InfluxDB supports three types of TLS certificates:
|
||||
|
||||
* **Single domain certificates signed by a [Certificate Authority](https://en.wikipedia.org/wiki/Certificate_authority)**
|
||||
|
||||
Single domain certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
|
||||
These certificates are signed and issued by a trusted, third-party Certificate Authority (CA).
|
||||
With this certificate option, every InfluxDB instance requires a unique single domain certificate.
|
||||
|
||||
* **Wildcard certificates signed by a Certificate Authority**
|
||||
|
||||
These certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
|
||||
Wildcard certificates can be used across multiple InfluxDB Enterprise instances on different servers.
|
||||
|
||||
* **Self-signed certificates**
|
||||
|
||||
Self-signed certificates are _not_ signed by a trusted, third-party CA.
|
||||
Unlike CA-signed certificates, self-signed certificates only provide cryptographic security to HTTPS requests.
|
||||
They do not allow clients to verify the identity of the InfluxDB server.
|
||||
With this certificate option, every InfluxDB Enterprise instance requires a unique self-signed certificate.
|
||||
You can generate a self-signed certificate on your own machine.
|
||||
|
||||
Regardless of your certificate's type, InfluxDB Enterprise supports certificates composed of
|
||||
a private key file (`.key`) and signed certificate file (`.crt`) pair, as well as certificates
|
||||
that combine the private key file and the signed certificate file into a single bundled file (`.pem`).
|
||||
|
||||
In general, each node should have its own certificate, whether signed or unsigned.
|
||||
|
||||
## Set up HTTPS in an InfluxDB Enterprise cluster
|
||||
|
||||
1. **Download or generate certificate files.**
|
||||
|
||||
If using a certificate provided by a CA, follow their instructions to download the certificate files.
|
||||
|
||||
{{% note %}}
|
||||
If using one or more self-signed certificates, use the `openssl` utility to create a certificate.
|
||||
The following command generates a private key file (`.key`) and a self-signed
|
||||
certificate file (`.crt`) which remain valid for the specified `NUMBER_OF_DAYS`.
|
||||
|
||||
```sh
|
||||
sudo openssl req -x509 -nodes -newkey rsa:2048 \
|
||||
-keyout influxdb-selfsigned.key \
|
||||
-out influxdb-selfsigned.crt \
|
||||
-days <NUMBER_OF_DAYS>
|
||||
```
|
||||
|
||||
The command will prompt you for more information.
|
||||
You can choose to fill out these fields or leave them blank; both actions generate valid certificate files.
|
||||
|
||||
In subsequent steps, you will need to copy the certificate and key (or `.pem` file) to each node in the cluster.
|
||||
{{% /note %}}
|
||||
|
||||
2. **Install the SSL/TLS certificate in each node.**
|
||||
|
||||
Place the private key file (`.key`) and the signed certificate file (`.crt`)
|
||||
or the single bundled file (`.pem`)
|
||||
in the `/etc/ssl/` directory of each meta node and data node.
|
||||
|
||||
{{% note %}}
|
||||
Some Certificate Authorities provide certificate files with other extensions.
|
||||
Consult your CA if you are unsure about how to use these files.
|
||||
{{% /note %}}
|
||||
|
||||
3. **Ensure file permissions for each node.**
|
||||
|
||||
Certificate files require read and write access by the `influxdb` user.
|
||||
Ensure that you have the correct file permissions in each meta node and data node by running the following commands:
|
||||
|
||||
```sh
|
||||
sudo chown influxdb:influxdb /etc/ssl/
|
||||
sudo chmod 644 /etc/ssl/<CA-certificate-file>
|
||||
sudo chmod 600 /etc/ssl/<private-key-file>
|
||||
```
|
||||
|
||||
4. **Enable HTTPS within the configuration file for each meta node.**
|
||||
|
||||
Enable HTTPS for each meta node within the `[meta]` section of the meta node configuration file (`influxdb-meta.conf`) by setting:
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether HTTPS is enabled.
|
||||
https-enabled = true
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-meta.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-meta.key"
|
||||
|
||||
# If using a self-signed certificate:
|
||||
https-insecure-tls = true
|
||||
|
||||
# Use TLS when communicating with data nodes
|
||||
data-use-tls = true
|
||||
data-insecure-tls = true
|
||||
|
||||
```
|
||||
|
||||
5. **Enable HTTPS within the configuration file for each data node.**
|
||||
|
||||
Make the following sets of changes in the configuration file (`influxdb.conf`) on each data node:
|
||||
1. Enable HTTPS for each data node within the `[http]` section of the configuration file by setting:
|
||||
```toml
|
||||
[http]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether HTTPS is enabled.
|
||||
https-enabled = true
|
||||
|
||||
[...]
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-data.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-data.key"
|
||||
```
|
||||
2. Configure the data nodes to use HTTPS when communicating with other data nodes.
|
||||
In the `[cluster]` section of the configuration file, set the following:
|
||||
```toml
|
||||
[cluster]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether data nodes use HTTPS to communicate with each other.
|
||||
https-enabled = true
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-data.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-data.key"
|
||||
|
||||
# If using a self-signed certificate:
|
||||
https-insecure-tls = true
|
||||
```
|
||||
3. Configure the data nodes to use HTTPS when communicating with the meta nodes.
|
||||
In the `[meta]` section of the configuration file, set the following:
|
||||
```toml
|
||||
[meta]
|
||||
|
||||
[...]
|
||||
meta-tls-enabled = true
|
||||
|
||||
# If using a self-signed certificate:
|
||||
meta-insecure-tls = true
|
||||
```
|
||||
6. **Restart InfluxDB Enterprise.**
|
||||
|
||||
Restart the InfluxDB Enterprise processes for the configuration changes to take effect:
|
||||
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
|
||||
Restart the InfluxDB Enterprise data node processes for the configuration changes to take effect:
|
||||
|
||||
```sh
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
|
||||
7. **Verify the HTTPS setup.**
|
||||
|
||||
Verify that HTTPS is working on the meta nodes by using `influxd-ctl`.
|
||||
|
||||
```sh
|
||||
influxd-ctl -bind-tls show
|
||||
```
|
||||
|
||||
If using a self-signed certificate, use:
|
||||
|
||||
```sh
|
||||
influxd-ctl -bind-tls -k show
|
||||
```
|
||||
|
||||
{{% warn %}}
|
||||
Once you have enabled HTTPS, you must use `-bind-tls` in order for `influxd-ctl` to connect to the meta node.
|
||||
With a self-signed certificate, you must also use the `-k` option to skip certificate verification.
|
||||
{{% /warn %}}
|
||||
|
||||
A successful connection returns output which should resemble the following:
|
||||
|
||||
```
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-01:8088 1.x.y-c1.x.y
|
||||
5 enterprise-data-02:8088 1.x.y-c1.x.y
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 1.x.y-c1.x.z
|
||||
enterprise-meta-02:8091 1.x.y-c1.x.z
|
||||
enterprise-meta-03:8091 1.x.y-c1.x.z
|
||||
```
|
||||
|
||||
Next, verify that HTTPS is working by connecting to InfluxDB Enterprise with the [`influx` command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/):
|
||||
|
||||
```sh
|
||||
influx -ssl -host <domain_name>.com
|
||||
```
|
||||
|
||||
If using a self-signed certificate, use
|
||||
|
||||
```sh
|
||||
influx -ssl -unsafeSsl -host <domain_name>.com
|
||||
```
|
||||
|
||||
A successful connection returns the following:
|
||||
|
||||
```sh
|
||||
Connected to https://<domain_name>.com:8086 version 1.x.y
|
||||
InfluxDB shell version: 1.x.y
|
||||
>
|
||||
```
|
||||
|
||||
That's it! You've successfully set up HTTPS with InfluxDB Enterprise.
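
As an additional spot check, you can query a data node's HTTPS endpoint directly with `curl` (a sketch; drop `-k` when using a CA-signed certificate):

```bash
# Verify a data node's HTTPS endpoint; -k skips verification for self-signed certificates.
curl -ik "https://<domain_name>.com:8086/ping"
```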
|
||||
|
||||
## Connect Telegraf to a secured InfluxDB Enterprise instance
|
||||
|
||||
Connecting [Telegraf](/{{< latest "telegraf" >}}/)
|
||||
to an HTTPS-enabled InfluxDB Enterprise instance requires some additional steps.
|
||||
|
||||
In Telegraf's configuration file (`/etc/telegraf/telegraf.conf`), under the OUTPUT PLUGINS section,
|
||||
edit the `urls` setting to indicate `https` instead of `http`.
|
||||
Also change `localhost` to the relevant domain name.
|
||||
|
||||
The best practice in terms of security is to transfer the certificate to the client and make it trusted
|
||||
(either by putting it in the operating system's trusted certificate store or using the `ssl_ca` option).
|
||||
The alternative is to sign the certificate using an internal CA and then trust the CA certificate.
|
||||
Provide the file paths of your key and certificate to the InfluxDB output plugin as shown below.
|
||||
|
||||
If you're using a self-signed certificate,
|
||||
uncomment the `insecure_skip_verify` setting and set it to `true`.
|
||||
|
||||
```toml
|
||||
###############################################################################
|
||||
# OUTPUT PLUGINS #
|
||||
###############################################################################
|
||||
|
||||
# Configuration for influxdb server to send metrics to
|
||||
[[outputs.influxdb]]
|
||||
## The full HTTP or UDP endpoint URL for your InfluxDB Enterprise instance.
|
||||
## Multiple urls can be specified as part of the same cluster,
|
||||
## this means that only ONE of the urls will be written to each interval.
|
||||
# urls = ["udp://localhost:8089"] # UDP endpoint example
|
||||
urls = ["https://<domain_name>.com:8086"]
|
||||
|
||||
[...]
|
||||
|
||||
## Optional SSL Config
|
||||
tls_cert = "/etc/telegraf/cert.pem"
|
||||
tls_key = "/etc/telegraf/key.pem"
|
||||
insecure_skip_verify = true # <-- Update only if you're using a self-signed certificate
|
||||
```
|
||||
|
||||
Next, restart Telegraf and you're all set!
|
||||
|
||||
```sh
|
||||
sudo systemctl restart telegraf
|
||||
```
|
|
@ -0,0 +1,299 @@
|
|||
---
|
||||
title: Configure LDAP authentication
|
||||
description: >
|
||||
Configure LDAP authentication in InfluxDB Enterprise and test LDAP connectivity.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Configure LDAP authentication
|
||||
parent: Configure security
|
||||
weight: 30
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/ldap/
|
||||
- /enterprise_influxdb/v1.10/administration/manage/security/ldap/
|
||||
---
|
||||
|
||||
Configure InfluxDB Enterprise to use LDAP (Lightweight Directory Access Protocol) to:
|
||||
|
||||
- Validate user permissions
|
||||
- Synchronize InfluxDB with LDAP so that the LDAP server doesn't need to be queried for each request
|
||||
|
||||
{{% note %}}
|
||||
LDAP **requires** JWT authentication. For more information, see [Configure authentication using JWT tokens](/enterprise_influxdb/v1.10/administration/configure/security/authentication/#configure-authentication-using-jwt-tokens).
|
||||
|
||||
To configure InfluxDB Enterprise to support LDAP, all users must be managed in the remote LDAP service. If LDAP is configured and enabled, users **must** authenticate through LDAP, including users who may have existed before enabling LDAP.
|
||||
{{% /note %}}
|
||||
|
||||
- [Configure LDAP for an InfluxDB Enterprise cluster](#configure-ldap-for-an-influxdb-enterprise-cluster)
|
||||
- [Sample LDAP configuration](#sample-ldap-configuration)
|
||||
- [Troubleshoot LDAP in InfluxDB Enterprise](#troubleshoot-ldap-in-influxdb-enterprise)
|
||||
|
||||
## Configure LDAP for an InfluxDB Enterprise cluster
|
||||
|
||||
To use LDAP with an InfluxDB Enterprise cluster, do the following:
|
||||
|
||||
1. [Configure data nodes](#configure-data-nodes)
|
||||
2. [Configure meta nodes](#configure-meta-nodes)
|
||||
3. [Create, verify, and upload the LDAP configuration file](#create-verify-and-upload-the-ldap-configuration-file)
|
||||
4. [Restart meta and data nodes](#restart-meta-and-data-nodes)
|
||||
|
||||
### Configure data nodes
|
||||
|
||||
Update the following settings in each data node configuration file (`/etc/influxdb/influxdb.conf`):
|
||||
|
||||
1. Under `[http]`, enable HTTP authentication by setting `auth-enabled` to `true`.
|
||||
(Or set the corresponding environment variable `INFLUXDB_HTTP_AUTH_ENABLED` to `true`.)
|
||||
2. Configure the HTTP shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the secret and username.
|
||||
Set the `[http]` configuration setting for `shared-secret`, or the corresponding environment variable `INFLUXDB_HTTP_SHARED_SECRET`.
|
||||
3. If you're enabling authentication on meta nodes, you must also include the following configurations:
|
||||
- `INFLUXDB_META_META_AUTH_ENABLED` environment variable, or `[http]` configuration setting `meta-auth-enabled`, is set to `true`.
|
||||
This value must be the same value as the meta node's `meta.auth-enabled` configuration.
|
||||
- `INFLUXDB_META_META_INTERNAL_SHARED_SECRET`,
|
||||
or the corresponding `[meta]` configuration setting `meta-internal-shared-secret`,
|
||||
is set to a secret value.
|
||||
This value must be the same value as the meta node's `meta.internal-shared-secret`. (An environment-variable sketch of these settings follows this list.)
|
||||
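
The same data node settings can also be expressed as environment variables. The following is a minimal sketch (for example, for a systemd drop-in or shell profile); the secret values are placeholders that must match across your cluster:

```sh
# Enable HTTP authentication and the JWT shared secret on the data node.
export INFLUXDB_HTTP_AUTH_ENABLED=true
export INFLUXDB_HTTP_SHARED_SECRET="<http-shared-secret>"

# Only needed if authentication is also enabled on the meta nodes.
# These must match the meta nodes' meta.auth-enabled and
# meta.internal-shared-secret settings.
export INFLUXDB_META_META_AUTH_ENABLED=true
export INFLUXDB_META_META_INTERNAL_SHARED_SECRET="<internal-shared-secret>"
```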
|
||||
### Configure meta nodes
|
||||
|
||||
Update the following settings in each meta node configuration file (`/etc/influxdb/influxdb-meta.conf`):
|
||||
|
||||
1. Configure the meta node META shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the username and shared secret.
|
||||
2. Set the `[meta]` configuration setting `internal-shared-secret` to `"<internal-shared-secret>"`.
|
||||
(Or set the `INFLUXDB_META_INTERNAL_SHARED_SECRET` environment variable.)
|
||||
3. Set the `[meta]` configuration setting `meta.ldap-allowed` to `true` on all meta nodes in your cluster.
|
||||
(Or set the `INFLUXDB_META_LDAP_ALLOWED` environment variable. A corresponding environment-variable sketch follows this list.)
|
||||
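
As a corresponding sketch, the meta node settings expressed as environment variables (the secret value is a placeholder and must match the data nodes' `meta-internal-shared-secret`):

```sh
# JWT internal shared secret used between meta and data nodes.
export INFLUXDB_META_INTERNAL_SHARED_SECRET="<internal-shared-secret>"

# Allow LDAP authentication on this meta node.
export INFLUXDB_META_LDAP_ALLOWED=true
```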
|
||||
### Authenticate your connection to InfluxDB
|
||||
|
||||
To authenticate your InfluxDB connection, run the following command, replacing `username:password` with your credentials:
|
||||
|
||||
{{< keep-url >}}
|
||||
```bash
|
||||
curl -u username:password -XPOST "http://localhost:8086/..."
|
||||
```
|
||||
|
||||
For more detail on authentication, see [Authentication and authorization in InfluxDB](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/).
|
||||
|
||||
### Create, verify, and upload the LDAP configuration file
|
||||
|
||||
1. To create a sample LDAP configuration file, run the following command:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap sample-config
|
||||
```
|
||||
|
||||
2. Save the sample file and edit as needed for your LDAP server.
|
||||
For detail, see the [sample LDAP configuration file](#sample-ldap-configuration) below.
|
||||
|
||||
> To use fine-grained authorization (FGA) with LDAP, you must map InfluxDB Enterprise roles to key-value pairs in the LDAP database.
> For more information, see [Fine-grained authorization in InfluxDB Enterprise](/enterprise_influxdb/v1.10/guides/fine-grained-authorization/).
> The InfluxDB admin user doesn't include permissions for InfluxDB Enterprise roles.
|
||||
|
||||
3. Restart all meta and data nodes in your InfluxDB Enterprise cluster to load your updated configuration.
|
||||
|
||||
On each **meta** node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb-meta restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
On each **data** node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
|
||||
4. To verify your LDAP configuration, run:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap verify -ldap-config /path/to/ldap.toml
|
||||
```
|
||||
|
||||
5. To load your LDAP configuration file, run the following command:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap set-config /path/to/ldap.toml
|
||||
```
|
||||
|
||||
|
||||
## Sample LDAP configuration
|
||||
|
||||
The following is a sample configuration file that connects to a publicly available LDAP server.
|
||||
|
||||
A `DN` ("distinguished name") uniquely identifies an entry and describes its position in the directory information tree (DIT) hierarchy.
|
||||
The DN of an LDAP entry is similar to a file path on a file system.
|
||||
`DNs` refers to multiple DN entries.
|
||||
|
||||
{{% truncate %}}
|
||||
```toml
|
||||
enabled = true
|
||||
|
||||
[[servers]]
|
||||
enabled = true
|
||||
|
||||
[[servers]]
|
||||
host = "<LDAPserver>"
|
||||
port = 389
|
||||
|
||||
# Security mode for LDAP connection to this server.
|
||||
# The recommended security is set "starttls" by default. This uses an initial unencrypted connection
|
||||
# and upgrades to TLS as the first action against the server,
|
||||
# per the LDAPv3 standard.
|
||||
# Other options are "starttls+insecure" to behave the same as starttls
|
||||
# but skip server certificate verification, or "none" to use an unencrypted connection.
|
||||
security = "starttls"
|
||||
|
||||
# Credentials to use when searching for a user or group.
|
||||
bind-dn = "cn=read-only-admin,dc=example,dc=com"
|
||||
bind-password = "password"
|
||||
|
||||
# Base DNs to use when applying the search-filter to discover an LDAP user.
|
||||
search-base-dns = [
|
||||
"dc=example,dc=com",
|
||||
]
|
||||
|
||||
# LDAP filter to discover a user's DN.
|
||||
# %s will be replaced with the provided username.
|
||||
search-filter = "(uid=%s)"
|
||||
# On Active Directory you might use "(sAMAccountName=%s)".
|
||||
|
||||
# Base DNs to use when searching for groups.
|
||||
group-search-base-dns = ["dc=example,dc=com"]
|
||||
|
||||
# LDAP filter to identify groups that a user belongs to.
|
||||
# %s will be replaced with the user's DN.
|
||||
group-membership-search-filter = "(&(objectClass=groupOfUniqueNames)(uniqueMember=%s))"
|
||||
# On Active Directory you might use "(&(objectClass=group)(member=%s))".
|
||||
|
||||
# Attribute to use to determine the "group" in the group-mappings section.
|
||||
group-attribute = "ou"
|
||||
# On Active Directory you might use "cn".
|
||||
|
||||
# LDAP filter to search for a group with a particular name.
|
||||
# This is used when warming the cache to load group membership.
|
||||
group-search-filter = "(&(objectClass=groupOfUniqueNames)(cn=%s))"
|
||||
# On Active Directory you might use "(&(objectClass=group)(cn=%s))".
|
||||
|
||||
# Attribute of a group that contains the DNs of the group's members.
|
||||
group-member-attribute = "uniqueMember"
|
||||
# On Active Directory you might use "member".
|
||||
|
||||
# Create an administrator role in InfluxDB and then log in as a member of the admin LDAP group. Only members of a group with the administrator role can complete admin tasks.
|
||||
# For example, if tesla is the only member of the `italians` group, you must log in as tesla/password.
|
||||
admin-groups = ["italians"]
|
||||
|
||||
# These two roles would have to be created by hand if you want these LDAP group memberships to do anything.
|
||||
[[servers.group-mappings]]
|
||||
group = "mathematicians"
|
||||
role = "arithmetic"
|
||||
|
||||
[[servers.group-mappings]]
|
||||
group = "scientists"
|
||||
role = "laboratory"
|
||||
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## Troubleshoot LDAP in InfluxDB Enterprise
|
||||
|
||||
### InfluxDB Enterprise does not recognize a new LDAP server
|
||||
|
||||
If you ever replace an LDAP server with a new one, you need to update your
|
||||
InfluxDB Enterprise LDAP configuration file to point to the new server.
|
||||
However, InfluxDB Enterprise may not recognize or honor the updated configuration.
|
||||
|
||||
For InfluxDB Enterprise to recognize an LDAP configuration pointing to a new
|
||||
LDAP server, do the following:
|
||||
|
||||
{{% warn %}}
|
||||
#### Not recommended in production InfluxDB Enterprise clusters
|
||||
|
||||
Performing the following process on a production cluster may have unintended consequences.
|
||||
Moving to a new LDAP server constitutes an infrastructure change and may better
|
||||
be handled through a cluster migration.
|
||||
For assistance, reach out to [InfluxData support](https://support.influxdata.com/s/contactsupport).
|
||||
{{% /warn %}}
|
||||
|
||||
1. On each meta node, update the `auth-enabled` setting to `false` in your
|
||||
`influxdb-meta.conf` configuration file to temporarily disable authentication.
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
auth-enabled = false
|
||||
```
|
||||
|
||||
2. Restart all meta nodes to load the updated configuration.
|
||||
On each meta node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb-meta restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
3. On each meta node, [create, verify, and upload the _new_ LDAP configuration file](#create-verify-and-upload-the-ldap-configuration-file).
|
||||
|
||||
4. On each meta node, update the `auth-enabled` setting to `true` in your `influxdb-meta.conf`
|
||||
configuration file to reenable authentication.
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
auth-enabled = true
|
||||
```
|
||||
|
||||
5. Restart all meta nodes to load the updated configuration.
|
||||
On each meta node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb-meta restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
|
@ -0,0 +1,11 @@
|
|||
---
|
||||
title: Manage
|
||||
description: Manage security, clusters, and subscriptions in InfluxDB enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manage
|
||||
weight: 12
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,21 @@
|
|||
---
|
||||
title: Manage InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Use the `influxd-ctl` and `influx` command line tools to manage InfluxDB Enterprise clusters and data.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/cluster-commands/
|
||||
- /enterprise_influxdb/v1.10/features/cluster-commands/
|
||||
- /enterprise_influxdb/v1.10/administration/cluster-commands/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manage clusters
|
||||
parent: Manage
|
||||
weight: 10
|
||||
---
|
||||
|
||||
Use the following tools to manage and interact with your InfluxDB Enterprise clusters:
|
||||
|
||||
- [`influxd-ctl`](/enterprise_influxdb/v1.10/tools/influxd-ctl/) cluster management utility
|
||||
- [`influx`](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) command line interface (CLI)
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,27 @@
|
|||
---
|
||||
title: Add node to existing cluster
|
||||
description: Add nodes to an existing InfluxDB Enterprise cluster.
|
||||
aliases:
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Add nodes
|
||||
parent: Manage clusters
|
||||
weight: 19
|
||||
---
|
||||
|
||||
To add a data node to an existing cluster, follow the steps below.
|
||||
|
||||
1. Install and start a new data node.
|
||||
Complete steps 1–3 of the [data node installation instructions](/enterprise_influxdb/v1.10/introduction/install-and-deploy/installation/data_node_installation/#step-1-add-appropriate-dns-entries-for-each-of-your-servers).
|
||||
2. To join the new node to the cluster, do one of the following:
|
||||
- From a meta node, run:
|
||||
```sh
|
||||
influxd-ctl add-data <new data node address>:<port>
|
||||
```
|
||||
- From a remote server, run:
|
||||
|
||||
```sh
|
||||
influxd-ctl -bind <existing_meta_node:8091> add-data <new data node address>:<port>
|
||||
```
|
||||
3. (Optional) [Rebalance the cluster](/enterprise_influxdb/v1.10/administration/manage/clusters/rebalance/).
|
|
@ -0,0 +1,420 @@
|
|||
---
|
||||
title: Rebalance InfluxDB Enterprise clusters
|
||||
description: Manually rebalance an InfluxDB Enterprise cluster.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/rebalance/
|
||||
- /enterprise_influxdb/v1.10/guides/rebalance/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Rebalance clusters
|
||||
parent: Manage clusters
|
||||
weight: 21
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
This guide describes how to manually rebalance an InfluxDB Enterprise cluster.
|
||||
Rebalancing a cluster involves two primary goals:
|
||||
|
||||
* Evenly distribute
|
||||
[shards](/enterprise_influxdb/v1.10/concepts/glossary/#shard) across all data nodes in the
|
||||
cluster
|
||||
* Ensure that every
|
||||
shard exists on *n* nodes, where *n* is determined by the retention policy's
|
||||
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
|
||||
|
||||
Rebalancing a cluster is essential for cluster health.
|
||||
Perform a rebalance if you add a new data node to your cluster.
|
||||
The proper rebalance path depends on the purpose of the new data node.
|
||||
If you added a data node to expand the disk size of the cluster or increase
|
||||
write throughput, follow the steps in
|
||||
[Rebalance Procedure 1](#rebalance-procedure-1-rebalance-a-cluster-to-create-space).
|
||||
If you added a data node to increase data availability for queries and query
|
||||
throughput, follow the steps in
|
||||
[Rebalance Procedure 2](#rebalance-procedure-2-rebalance-a-cluster-to-increase-availability).
|
||||
|
||||
### Requirements
|
||||
|
||||
The following sections assume that you already added a new data node to the
|
||||
cluster, and they use the
|
||||
[`influxd-ctl` tool](/enterprise_influxdb/v1.10/tools/influxd-ctl/) available on
|
||||
all meta nodes.
|
||||
|
||||
{{% warn %}}
|
||||
Before you begin, stop writing historical data to InfluxDB.
|
||||
Historical data have timestamps that occur at any time in the past.
|
||||
Performing a rebalance while writing historical data can lead to data loss.
|
||||
{{% /warn %}}
|
||||
|
||||
## Rebalance Procedure 1: Rebalance a cluster to create space
|
||||
|
||||
For demonstration purposes, the next steps assume that you added a third
|
||||
data node to a previously two-data-node cluster that has a
|
||||
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor) of
|
||||
two.
|
||||
This rebalance procedure is applicable for different cluster sizes and
|
||||
replication factors, but some of the specific, user-provided values will depend
|
||||
on that cluster size.
|
||||
|
||||
Rebalance Procedure 1 focuses on how to rebalance a cluster after adding a
|
||||
data node to expand the total disk capacity of the cluster.
|
||||
In the next steps, you will safely move shards from one of the two original data
|
||||
nodes to the new data node.
|
||||
|
||||
### Step 1: Truncate Hot Shards
|
||||
|
||||
Hot shards are shards that are currently receiving writes.
|
||||
Performing any action on a hot shard can lead to data inconsistency within the
|
||||
cluster which requires manual intervention from the user.
|
||||
|
||||
To prevent data inconsistency, truncate hot shards before moving any shards
|
||||
across data nodes.
|
||||
The command below creates a new hot shard which is automatically distributed
|
||||
across all data nodes in the cluster, and the system writes all new points to
|
||||
that shard.
|
||||
All previous writes are now stored in cold shards.
|
||||
|
||||
```
|
||||
influxd-ctl truncate-shards
|
||||
```
|
||||
|
||||
The expected output of this command is:
|
||||
|
||||
```
|
||||
Truncated shards.
|
||||
```
|
||||
|
||||
Once you truncate the shards, you can work on redistributing the cold shards
|
||||
without the threat of data inconsistency in the cluster.
|
||||
Any hot or new shards are now evenly distributed across the cluster and require
|
||||
no further intervention.
|
||||
|
||||
### Step 2: Identify Cold Shards
|
||||
|
||||
In this step, you identify the cold shards that you will copy to the new data node
|
||||
and remove from one of the original two data nodes.
|
||||
|
||||
The following command lists every shard in your cluster:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output is similar to the items in the codeblock below:
|
||||
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
21 telegraf autogen 2 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
24 telegraf autogen 2 [...] 2017-01-26T19:00:00Z [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
The sample output includes three shards.
|
||||
The first two shards are cold shards.
|
||||
The timestamp in the `End` column occurs in the past (assume that the current
|
||||
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
|
||||
are the two original data nodes: `enterprise-data-01:8088` and
|
||||
`enterprise-data-02:8088`.
|
||||
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
|
||||
on the timestamp in the `End` column.
|
||||
|
||||
The third shard is the newly-created hot shard; the timestamp in the `End`
|
||||
column is in the future (again, assume that the current time is just after
|
||||
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include one of the
|
||||
original data nodes (`enterprise-data-02:8088`) and the new data node
|
||||
(`enterprise-data-03:8088`).
|
||||
That hot shard and any subsequent shards require no attention during
|
||||
the rebalance process.
|
||||
|
||||
Identify the cold shards that you'd like to move from one of the original two
|
||||
data nodes to the new data node.
|
||||
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
|
||||
one of its owners in the `Owners` column (for example:
|
||||
`enterprise-data-01:8088`).
|
||||
|
||||
> **Note:**
> Use the following command string to determine the size of the shards in
> your cluster:
>
> `find /var/lib/influxdb/data/ -mindepth 3 -type d -exec du -h {} \;`
>
> In general, we recommend moving larger shards to the new data node to increase the
> available disk space on the original data nodes.
> Users should note that moving shards will impact network traffic.
|
||||
|
||||
### Step 3: Copy Cold Shards
|
||||
|
||||
Next, copy the relevant cold shards to the new data node with the syntax below.
|
||||
Repeat this command for every cold shard that you'd like to move to the
|
||||
new data node.
|
||||
|
||||
```
|
||||
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the address that you noted in step 2,
|
||||
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
|
||||
is the ID of the shard that you noted in step 2.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
|
||||
```
|
||||
|
||||
### Step 4: Confirm the Copied Shards
|
||||
|
||||
Confirm that the TCP address of the new data node appears in the `Owners` column
|
||||
for every copied shard:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has three owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
In addition, verify that the copied shards appear in the new data node's shard
|
||||
directory and match the shards in the source data node's shard directory.
|
||||
Shards are located in
|
||||
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
|
||||
|
||||
Here's an example of the correct output for shard `22`:
|
||||
```
|
||||
# On the source data node (enterprise-data-01)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
|
||||
# On the new data node (enterprise-data-03)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
```
|
||||
|
||||
It is essential that every copied shard appears on the new data node both
|
||||
in the `influxd-ctl show-shards` output and in the shard directory.
|
||||
If a shard does not pass both of the tests above, please repeat step 3.
|
||||
|
||||
### Step 5: Remove Unnecessary Cold Shards
|
||||
|
||||
Next, remove the copied shard from the original data node with the command below.
|
||||
Repeat this command for every cold shard that you'd like to remove from one of
|
||||
the original data nodes.
|
||||
**Removing a shard is an irrecoverable, destructive action; please be
|
||||
cautious with this command.**
|
||||
|
||||
```
|
||||
influxd-ctl remove-shard <source_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the TCP address of the original data node and
|
||||
`shard_ID` is the ID of the shard that you noted in step 2.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Removed shard <shard_ID> from <source_TCP_address>
|
||||
```
|
||||
|
||||
### Step 6: Confirm the Rebalance
|
||||
|
||||
For every relevant shard, confirm that the TCP address of the new data node and
|
||||
only one of the original data nodes appears in the `Owners` column:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has only two owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
That's it.
|
||||
You've successfully rebalanced your cluster; you expanded the available disk
|
||||
size on the original data nodes and increased the cluster's write throughput.
|
||||
|
||||
## Rebalance Procedure 2: Rebalance a cluster to increase availability
|
||||
|
||||
For demonstration purposes, the next steps assume that you added a third
|
||||
data node to a previously two-data-node cluster that has a
|
||||
[replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor) of
|
||||
two.
|
||||
This rebalance procedure is applicable for different cluster sizes and
|
||||
replication factors, but some of the specific, user-provided values will depend
|
||||
on that cluster size.
|
||||
|
||||
Rebalance Procedure 2 focuses on how to rebalance a cluster to improve availability
|
||||
and query throughput.
|
||||
In the next steps, you will increase the retention policy's replication factor and
|
||||
safely copy shards from one of the two original data nodes to the new data node.
|
||||
|
||||
### Step 1: Update the Retention Policy
|
||||
|
||||
[Update](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy)
|
||||
every retention policy to have a replication factor of three.
|
||||
This step ensures that the system automatically distributes all newly-created
|
||||
shards across the three data nodes in the cluster.
|
||||
|
||||
The following query increases the replication factor to three.
|
||||
Run the query on any data node for each retention policy and database.
|
||||
Here, we use InfluxDB's [CLI](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) to execute the query:
|
||||
|
||||
```
|
||||
> ALTER RETENTION POLICY "<retention_policy_name>" ON "<database_name>" REPLICATION 3
|
||||
>
|
||||
```
|
||||
|
||||
A successful `ALTER RETENTION POLICY` query returns no results.
|
||||
Use the
|
||||
[`SHOW RETENTION POLICIES` query](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-retention-policies)
|
||||
to verify the new replication factor.
|
||||
|
||||
Example:
|
||||
```
|
||||
> SHOW RETENTION POLICIES ON "telegraf"
|
||||
|
||||
name duration shardGroupDuration replicaN default
|
||||
---- -------- ------------------ -------- -------
|
||||
autogen 0s 1h0m0s 3 #👍 true
|
||||
```
|
||||
|
||||
### Step 2: Truncate Hot Shards
|
||||
|
||||
Hot shards are shards that are currently receiving writes.
|
||||
Performing any action on a hot shard can lead to data inconsistency within the
|
||||
cluster which requires manual intervention from the user.
|
||||
|
||||
To prevent data inconsistency, truncate hot shards before copying any shards
|
||||
to the new data node.
|
||||
The command below creates a new hot shard which is automatically distributed
|
||||
across the three data nodes in the cluster, and the system writes all new points
|
||||
to that shard.
|
||||
All previous writes are now stored in cold shards.
|
||||
|
||||
```
|
||||
influxd-ctl truncate-shards
|
||||
```
|
||||
|
||||
The expected output of this command is:
|
||||
|
||||
```
|
||||
Truncated shards.
|
||||
```
|
||||
|
||||
Once you truncate the shards, you can work on distributing the cold shards
|
||||
without the threat of data inconsistency in the cluster.
|
||||
Any hot or new shards are now automatically distributed across the cluster and
|
||||
require no further intervention.
|
||||
|
||||
### Step 3: Identify Cold Shards
|
||||
|
||||
In this step, you identify the cold shards that you will copy to the new data node.
|
||||
|
||||
The following command lists every shard in your cluster:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output is similar to the items in the codeblock below:
|
||||
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
21 telegraf autogen 3 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
24 telegraf autogen 3 [...] 2017-01-26T19:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
The sample output includes three shards.
|
||||
The first two shards are cold shards.
|
||||
The timestamp in the `End` column occurs in the past (assume that the current
|
||||
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
|
||||
are the two original data nodes: `enterprise-data-01:8088` and
|
||||
`enterprise-data-02:8088`.
|
||||
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
|
||||
on the timestamp in the `End` column.
|
||||
|
||||
The third shard is the newly-created hot shard; the timestamp in the `End`
|
||||
column is in the future (again, assume that the current time is just after
|
||||
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include all three
|
||||
data nodes: `enterprise-data-01:8088`, `enterprise-data-02:8088`, and
|
||||
`enterprise-data-03:8088`.
|
||||
That hot shard and any subsequent shards require no attention during
|
||||
the rebalance process.
|
||||
|
||||
Identify the cold shards that you'd like to copy from one of the original two
|
||||
data nodes to the new data node.
|
||||
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
|
||||
one of its owners in the `Owners` column (for example:
|
||||
`enterprise-data-01:8088`).
|
||||
|
||||
### Step 4: Copy Cold Shards
|
||||
|
||||
Next, copy the relevant cold shards to the new data node with the syntax below.
|
||||
Repeat this command for every cold shard that you'd like to move to the
|
||||
new data node.
|
||||
|
||||
```
|
||||
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the address that you noted in step 3,
|
||||
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
|
||||
is the ID of the shard that you noted in step 3.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
|
||||
```
|
||||
|
||||
### Step 5: Confirm the Rebalance
|
||||
|
||||
Confirm that the TCP address of the new data node appears in the `Owners` column
|
||||
for every copied shard:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has three owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
In addition, verify that the copied shards appear in the new data node's shard
|
||||
directory and match the shards in the source data node's shard directory.
|
||||
Shards are located in
|
||||
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
|
||||
|
||||
Here's an example of the correct output for shard `22`:
|
||||
```
|
||||
# On the source data node (enterprise-data-01)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
|
||||
# On the new data node (enterprise-data-03)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
```
|
||||
|
||||
That's it.
|
||||
You've successfully rebalanced your cluster and increased data availability for
|
||||
queries and query throughput.
|
|
@ -0,0 +1,428 @@
|
|||
---
|
||||
title: Replace InfluxDB Enterprise cluster meta nodes and data nodes
|
||||
description: Replace meta and data nodes in an InfluxDB Enterprise cluster.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/replacing-nodes/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Replace nodes
|
||||
parent: Manage clusters
|
||||
weight: 20
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
Nodes in an InfluxDB Enterprise cluster may need to be replaced at some point due to hardware needs, hardware issues, or something else entirely.
|
||||
This guide outlines processes for replacing both meta nodes and data nodes in an InfluxDB Enterprise cluster.
|
||||
|
||||
## Concepts
|
||||
Meta nodes manage and monitor both the uptime of nodes in the cluster and the distribution of [shards](/enterprise_influxdb/v1.10/concepts/glossary/#shard) among nodes in the cluster.
|
||||
They hold information about which data nodes own which shards; information on which the
|
||||
[anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/) (AE) process depends.
|
||||
|
||||
Data nodes hold raw time-series data and metadata. Data shards are both distributed and replicated across data nodes in the cluster. The AE process runs on data nodes and references the shard information stored in the meta nodes to ensure each data node has the shards they need.
|
||||
|
||||
`influxd-ctl` is a CLI included in each meta node and is used to manage your InfluxDB Enterprise cluster.
|
||||
|
||||
## Scenarios
|
||||
|
||||
### Replace nodes in clusters with security enabled
|
||||
Many InfluxDB Enterprise clusters are configured with security enabled, forcing secure TLS encryption between all nodes in the cluster.
|
||||
Both `influxd-ctl` and `curl`, the command line tools used when replacing nodes, have options that facilitate the use of TLS.
|
||||
|
||||
#### `influxd-ctl -bind-tls`
|
||||
In order to manage your cluster over TLS, pass the `-bind-tls` flag with any `influxd-ctl` command.
|
||||
|
||||
> If using a self-signed certificate, pass the `-k` flag to skip certificate verification.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl -bind-tls [-k] <command>
|
||||
|
||||
# Example
|
||||
influxd-ctl -bind-tls remove-meta enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
#### `curl -k`
|
||||
|
||||
`curl` natively supports TLS/SSL connections, but if using a self-signed certificate, pass the `-k`/`--insecure` flag to allow for "insecure" SSL connections.
|
||||
|
||||
> Self-signed certificates are considered "insecure" due to their lack of a valid chain of authority. However, data is still encrypted when using self-signed certificates.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
curl [-k, --insecure] <url>
|
||||
|
||||
# Example
|
||||
curl -k https://localhost:8091/status
|
||||
```
|
||||
|
||||
### Replace meta nodes in a functional cluster
|
||||
|
||||
If all meta nodes in the cluster are fully functional, simply follow the steps for [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Replace an unresponsive meta node
|
||||
|
||||
If replacing a meta node that is either unreachable or unrecoverable, you need to forcefully remove it from the meta cluster. Instructions for forcefully removing meta nodes are provided in [step 2.2](#2-2-remove-the-non-leader-meta-node) of the [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster) process.
|
||||
|
||||
### Replace responsive and unresponsive data nodes in a cluster
|
||||
|
||||
The process of replacing both responsive and unresponsive data nodes is the same. Simply follow the instructions for [replacing data nodes](#replace-data-nodes-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Reconnect a data node with a failed disk
|
||||
|
||||
A disk drive failing is never a good thing, but it does happen, and when it does,
|
||||
all shards on that node are lost.
|
||||
|
||||
Often in this scenario, rather than replacing the entire host, you just need to replace the disk.
|
||||
Host information remains the same, but once started again, the `influxd` process doesn't know
|
||||
to communicate with the meta nodes so the AE process can't start the shard-sync process.
|
||||
|
||||
To resolve this, log in to a meta node and use the [`influxd-ctl update-data`](/enterprise_influxdb/v1.10/tools/influxd-ctl/#update-data) command
|
||||
to [update the failed data node to itself](#2-replace-the-old-data-node-with-the-new-data-node).
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl update-data <data-node-tcp-bind-address> <data-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-01:8088
|
||||
```
|
||||
|
||||
This will connect the `influxd` process running on the newly replaced disk to the cluster.
|
||||
The AE process will detect the missing shards and begin to sync data from other
|
||||
shards in the same shard group.
|
||||
|
||||
|
||||
## Replace meta nodes in an InfluxDB Enterprise cluster
|
||||
|
||||
[Meta nodes](/enterprise_influxdb/v1.10/concepts/clustering/#meta-nodes) together form a [Raft](https://raft.github.io/) cluster in which nodes elect a leader through consensus vote.
|
||||
The leader oversees the management of the meta cluster, so it is important to replace non-leader nodes before the leader node.
|
||||
The process for replacing meta nodes is as follows:
|
||||
|
||||
1. [Identify the leader node](#1-identify-the-leader-node)
|
||||
2. [Replace all non-leader nodes](#2-replace-all-non-leader-nodes)
|
||||
2.1. [Provision a new meta node](#2-1-provision-a-new-meta-node)
|
||||
2.2. [Remove the non-leader meta node](#2-2-remove-the-non-leader-meta-node)
|
||||
2.3. [Add the new meta node](#2-3-add-the-new-meta-node)
|
||||
2.4. [Confirm the meta node was added](#2-4-confirm-the-meta-node-was-added)
|
||||
2.5. [Remove and replace all other non-leader meta nodes](#2-5-remove-and-replace-all-other-non-leader-meta-nodes)
|
||||
3. [Replace the leader node](#3-replace-the-leader-node)
|
||||
3.1. [Kill the meta process on the leader node](#3-1-kill-the-meta-process-on-the-leader-node)
|
||||
3.2. [Remove and replace the old leader node](#3-2-remove-and-replace-the-old-leader-node)
|
||||
|
||||
### 1. Identify the leader node
|
||||
|
||||
Log into any of your meta nodes and run the following:
|
||||
|
||||
```bash
|
||||
curl -s localhost:8091/status | jq
|
||||
```
|
||||
|
||||
> Piping the command into `jq` is optional, but does make the JSON output easier to read.
|
||||
|
||||
The output will include information about the current meta node, the leader of the meta cluster, and a list of "peers" in the meta cluster.
|
||||
|
||||
```json
|
||||
{
|
||||
"nodeType": "meta",
|
||||
"leader": "enterprise-meta-01:8089",
|
||||
"httpAddr": "enterprise-meta-01:8091",
|
||||
"raftAddr": "enterprise-meta-01:8089",
|
||||
"peers": [
|
||||
"enterprise-meta-01:8089",
|
||||
"enterprise-meta-02:8089",
|
||||
"enterprise-meta-03:8089"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Identify the `leader` of the cluster. When replacing nodes in a cluster, non-leader nodes should be replaced _before_ the leader node.
|
||||
|
||||
### 2. Replace all non-leader nodes
|
||||
|
||||
#### 2.1. Provision a new meta node
|
||||
|
||||
[Provision and start a new meta node](/enterprise_influxdb/v1.10/installation/meta_node_installation/), but **do not** add it to the cluster yet.
|
||||
For this guide, the new meta node's hostname will be `enterprise-meta-04`.
|
||||
|
||||
#### 2.2. Remove the non-leader meta node
|
||||
|
||||
Now remove the non-leader node you are replacing by using the `influxd-ctl remove-meta` command and the TCP address of the meta node (ex. `enterprise-meta-02:8091`):
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl remove-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl remove-meta enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
> Only use `remove-meta` if you want to permanently remove a meta node from a cluster.
|
||||
|
||||
<!-- -->
|
||||
|
||||
> **For unresponsive or unrecoverable meta nodes:**
>
> If the meta process is not running on the node you are trying to remove or the node is neither reachable nor recoverable, use the `-force` flag.
> When forcefully removing a meta node, you must also pass the `-tcpAddr` flag with the TCP and HTTP bind addresses of the node you are removing.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl remove-meta -force -tcpAddr <meta-node-tcp-bind-address> <meta-node-http-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl remove-meta -force -tcpAddr enterprise-meta-02:8089 enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
#### 2.3. Add the new meta node
|
||||
|
||||
Once the non-leader meta node has been removed, use `influxd-ctl add-meta` to replace it with the new meta node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl add-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl add-meta enterprise-meta-04:8091
|
||||
```
|
||||
|
||||
You can also add a meta node remotely through another meta node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl -bind <remote-meta-node-bind-address> add-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl -bind enterprise-meta-node-01:8091 add-meta enterprise-meta-node-04:8091
|
||||
```
|
||||
|
||||
> This command contacts the meta node running at `enterprise-meta-node-01:8091` and adds a meta node to that meta node’s cluster.
> The added meta node has the hostname `enterprise-meta-node-04` and runs on port `8091`.
|
||||
|
||||
#### 2.4. Confirm the meta node was added
|
||||
|
||||
Confirm the new meta node has been added by running:
|
||||
|
||||
```bash
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
The new meta node should appear in the output:
|
||||
|
||||
```bash
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-01:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
5 enterprise-data-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-04:8091 {{< latest-patch >}}-c{{< latest-patch >}} # <-- The newly added meta node
|
||||
```
|
||||
|
||||
#### 2.5. Remove and replace all other non-leader meta nodes
|
||||
|
||||
**If replacing only one meta node, no further action is required.**
|
||||
If replacing others, repeat steps [2.1-2.4](#2-1-provision-a-new-meta-node) for all non-leader meta nodes one at a time.
|
||||
|
||||
### 3. Replace the leader node
|
||||
|
||||
As non-leader meta nodes are removed and replaced, the leader node oversees the replication of data to each of the new meta nodes.
|
||||
Leave the leader up and running until at least two of the new meta nodes are up, running and healthy.
|
||||
|
||||
#### 3.1. Kill the meta process on the leader node
|
||||
|
||||
Log into the leader meta node and kill the meta process.
|
||||
|
||||
```bash
|
||||
# List the running processes and get the
|
||||
# PID of the 'influx-meta' process
|
||||
ps aux
|
||||
|
||||
# Kill the 'influx-meta' process
|
||||
kill <PID>
|
||||
```
|
||||
|
||||
Once killed, the meta cluster will elect a new leader using the [raft consensus algorithm](https://raft.github.io/).
|
||||
Confirm the new leader by running:
|
||||
|
||||
```bash
|
||||
curl localhost:8091/status | jq
|
||||
```
|
||||
|
||||
#### 3.2. Remove and replace the old leader node
|
||||
|
||||
Remove the old leader node and replace it by following steps [2.1-2.4](#2-1-provision-a-new-meta-node).
|
||||
The minimum number of meta nodes you should have in your cluster is 3.
|
||||
|
||||
## Replace data nodes in an InfluxDB Enterprise cluster
|
||||
|
||||
[Data nodes](/enterprise_influxdb/v1.10/concepts/clustering/#data-nodes) house all raw time series data and metadata.
|
||||
The process of replacing data nodes is as follows:
|
||||
|
||||
1. [Provision a new data node](#1-provision-a-new-data-node)
|
||||
2. [Replace the old data node with the new data node](#2-replace-the-old-data-node-with-the-new-data-node)
|
||||
3. [Confirm the data node was added](#3-confirm-the-data-node-was-added)
|
||||
4. [Check the copy-shard-status](#4-check-the-copy-shard-status)
|
||||
|
||||
### 1. Provision a new data node
|
||||
|
||||
[Provision and start a new data node](/enterprise_influxdb/v1.10/installation/data_node_installation/), but **do not** add it to your cluster yet.
|
||||
|
||||
### 2. Replace the old data node with the new data node
|
||||
|
||||
Log into any of your cluster's meta nodes and use `influxd-ctl update-data` to replace the old data node with the new data node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl update-data <old-node-tcp-bind-address> <new-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-03:8088
|
||||
```
|
||||
|
||||
### 3. Confirm the data node was added
|
||||
|
||||
Confirm the new data node has been added by running:
|
||||
|
||||
```bash
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
The new data node should appear in the output:
|
||||
|
||||
```bash
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-03:8088 {{< latest-patch >}}-c{{< latest-patch >}} # <-- The newly added data node
|
||||
5 enterprise-data-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
Inspect your cluster's shard distribution with `influxd-ctl show-shards`.
|
||||
Shards will immediately reflect the new address of the node.
|
||||
|
||||
```bash
|
||||
influxd-ctl show-shards
|
||||
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas Shard Group Start End Expires Owners
|
||||
3 telegraf autogen 2 2 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
|
||||
1 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{5 enterprise-data-02:8088}]
|
||||
2 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{4 enterprise-data-03:8088}]
|
||||
4 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{5 enterprise-data-02:8088}]
|
||||
5 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{4 enterprise-data-03:8088}]
|
||||
6 foo autogen 2 4 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
Within the duration defined by [`anti-entropy.check-interval`](/enterprise_influxdb/v1.10/administration/config-data-nodes#check-interval-10m),
|
||||
the AE service will begin copying shards from other shard owners to the new node.
|
||||
The time it takes for copying to complete is determined by the number of shards copied and how much data is stored in each.
|
||||
|
||||
### 4. Check the `copy-shard-status`
|
||||
|
||||
Check on the status of the copy-shard process with:
|
||||
|
||||
```bash
|
||||
influxd-ctl copy-shard-status
|
||||
```
|
||||
|
||||
The output will show all currently running copy-shard processes.
|
||||
|
||||
```bash
|
||||
Source Dest Database Policy ShardID TotalSize CurrentSize StartedAt
|
||||
enterprise-data-02:8088 enterprise-data-03:8088 telegraf autogen 3 119624324 119624324 2018-04-17 23:45:09.470696179 +0000 UTC
|
||||
```
|
||||
|
||||
> **Important:** If replacing other data nodes in the cluster, make sure shards are completely copied from nodes in the same shard group before replacing the other nodes.
|
||||
View the [Anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/#concepts) documentation for important information regarding anti-entropy and your database's replication factor.
|
||||
|
||||
## Troubleshoot
|
||||
|
||||
### Cluster commands result in timeout without error
|
||||
|
||||
In some cases, commands used to add or remove nodes from your cluster
|
||||
time out but don't return an error.
|
||||
|
||||
```bash
|
||||
add-data: operation timed out with error:
|
||||
```
|
||||
|
||||
#### Check your InfluxDB user permissions
|
||||
|
||||
In order to add or remove nodes to or from a cluster, your user must have `AddRemoveNode` permissions.
|
||||
Attempting to manage cluster nodes without the appropriate permissions results
|
||||
in a timeout with no accompanying error.
|
||||
|
||||
To check user permissions, log in to one of your meta nodes and `curl` the `/user` API endpoint:
|
||||
|
||||
```bash
|
||||
curl localhost:8091/user
|
||||
```
|
||||
|
||||
You can also check the permissions of a specific user by passing the username with the `name` parameter:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
curl localhost:8091/user?name=<username>
|
||||
|
||||
# Example
|
||||
curl localhost:8091/user?name=bob
|
||||
```
|
||||
|
||||
The JSON output will include user information and permissions:
|
||||
|
||||
```json
|
||||
"users": [
|
||||
{
|
||||
"name": "bob",
|
||||
"hash": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
|
||||
"permissions": {
|
||||
"": [
|
||||
"ViewAdmin",
|
||||
"ViewChronograf",
|
||||
"CreateDatabase",
|
||||
"CreateUserAndRole",
|
||||
"DropDatabase",
|
||||
"DropData",
|
||||
"ReadData",
|
||||
"WriteData",
|
||||
"ManageShard",
|
||||
"ManageContinuousQuery",
|
||||
"ManageQuery",
|
||||
"ManageSubscription",
|
||||
"Monitor"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
_In the output above, `bob` does not have the required `AddRemoveNode` permissions
|
||||
and would not be able to add or remove nodes from the cluster._
|
||||
|
||||
#### Check the network connection between nodes
|
||||
|
||||
Something may be interrupting the network connection between nodes.
|
||||
To check, `ping` the server or node you're trying to add or remove.
|
||||
If the ping is unsuccessful, something in the network is preventing communication.
|
||||
|
||||
```bash
|
||||
ping enterprise-data-03
|
||||
```
|
||||
|
||||
_If pings are unsuccessful, be sure to ping from other meta nodes as well to determine
|
||||
if the communication issues are unique to specific nodes._
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
title: Rename hosts in InfluxDB Enterprise
|
||||
description: Rename a host within your InfluxDB Enterprise instance.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/renaming/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Rename hosts
|
||||
parent: Manage
|
||||
weight: 40
|
||||
---
|
||||
|
||||
## Host renaming
|
||||
|
||||
The following instructions allow you to rename a host within your InfluxDB Enterprise instance.
|
||||
|
||||
First, suspend write and query activity to the cluster.
|
||||
|
||||
### Rename meta nodes
|
||||
|
||||
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node. Start with the meta nodes that are not the leader.
|
||||
- On a non-leader meta node, run `influxd-ctl remove-meta`. Once removed, confirm by running `influxd-ctl show` on the meta leader.
|
||||
- Stop the meta service on the removed node and edit its configuration file (`/etc/influxdb/influxdb-meta.conf`) to set the new `hostname`.
|
||||
- Update the OS hostname if needed and apply any DNS changes.
|
||||
- Start the meta service.
|
||||
- On the meta leader, add the meta node with the new hostname using `influxd-ctl add-meta newmetanode:8091`. Confirm with `influxd-ctl show`. (A consolidated command sketch follows this list.)
|
||||
- Repeat for the second meta node.
|
||||
- Once the two non-leaders are updated, stop the leader and wait for another meta node to become the leader; check with `curl localhost:8091/status`.
|
||||
- Repeat the process for the last meta node (former leader).
|
||||
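
As a consolidated sketch, renaming a single non-leader meta node might look like the following. The hostnames `oldmetanode` and `newmetanode` are placeholders, and the service commands assume systemd:

```sh
# On the meta leader: remove the node that will be renamed.
influxd-ctl remove-meta oldmetanode:8091

# On the node being renamed: stop the service, then edit
# /etc/influxdb/influxdb-meta.conf and set: hostname = "newmetanode"
sudo systemctl stop influxdb-meta
# Update the OS hostname and DNS as needed, then restart the service.
sudo systemctl start influxdb-meta

# Back on the meta leader: add the node under its new name and confirm.
influxd-ctl add-meta newmetanode:8091
influxd-ctl show
```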
|
||||
### Intermediate verification
|
||||
|
||||
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
|
||||
- Verify there is a meta leader with `curl localhost:8091/status` and that all meta nodes list the rest in the output.
|
||||
- Restart all data nodes one by one. Verify that `/var/lib/influxdb/meta/client.json` on all data nodes references the new meta names.
|
||||
- Verify the `show shards` output lists all shards and node ownership as expected.
|
||||
- Verify that the cluster is functionally healthy and responds to writes and queries (see the command sketch below).
|
||||
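
The verification commands referenced above, gathered in one place (run on a meta node unless noted otherwise):

```sh
# Cluster membership and the version reported by every node.
influxd-ctl show

# Meta leader and peer list.
curl -s localhost:8091/status

# On each data node, after its restart: confirm it references the new meta hostnames.
cat /var/lib/influxdb/meta/client.json

# Shard distribution and ownership.
influxd-ctl show-shards
```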
|
||||
### Rename data nodes
|
||||
|
||||
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node.
|
||||
- Stop the service on the data node you want to rename. Edit its configuration file to set the new `hostname` under `/etc/influxdb/influxdb.conf`.
|
||||
- Update the OS hostname if needed and apply any DNS changes.
|
||||
- Start the data service. Errors will be logged until it is added to the cluster again.
|
||||
- On the meta node leader, run `influxd-ctl update-data oldname:8088 newname:8088`. Upon success, a message confirms that the data node was updated to `newname:8088` (see the sketch after this list).
|
||||
- Verify with `influxd-ctl show` on the meta node leader. Verify there are no errors in the logs of the updated data node and other data nodes. Restart the service on the updated data node. Verify writes, replication and queries work as expected.
|
||||
- Repeat on the remaining data nodes. Remember to only execute the `update-data` command from the meta leader.
|
||||
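
A consolidated sketch for renaming one data node. The hostnames `olddatanode` and `newdatanode` are placeholders, and the service commands assume systemd:

```sh
# On the data node being renamed: stop the service, then edit
# /etc/influxdb/influxdb.conf and set: hostname = "newdatanode"
sudo systemctl stop influxdb
# Update the OS hostname and DNS as needed, then restart the service.
sudo systemctl start influxdb

# On the meta leader only: point the cluster at the new name and confirm.
influxd-ctl update-data olddatanode:8088 newdatanode:8088
influxd-ctl show
```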
|
||||
### Final verification
|
||||
|
||||
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
|
||||
- Verify the `show shards` output lists all shards and node ownership as expected.
|
||||
- Verify meta queries work (for example, show measurements on a database), as shown in the example after this list.
|
||||
- Verify data are being queried successfully.
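
A minimal functional check using the `influx` CLI (database and measurement names are placeholders; add `-username`/`-password` if authentication is enabled):

```sh
# Meta query: list measurements in a database.
influx -execute 'SHOW MEASUREMENTS ON mydb'

# Data query: read a few recent points from a measurement.
influx -execute 'SELECT * FROM "mydb"."autogen"."cpu" LIMIT 5'
```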
|
||||
|
||||
Once you've performed the verification steps, resume write and query activity.
|
|
@ -0,0 +1,210 @@
|
|||
---
|
||||
title: Manage subscriptions in InfluxDB
|
||||
description: >
|
||||
Manage subscriptions, which copy all written data to a local or remote endpoint, in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manage subscriptions
|
||||
parent: Manage
|
||||
weight: 30
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/subscription-management/
|
||||
---
|
||||
|
||||
InfluxDB subscriptions are local or remote endpoints to which all data written to InfluxDB is copied.
|
||||
Subscriptions are primarily used with [Kapacitor](/kapacitor/), but any endpoint
|
||||
able to accept UDP, HTTP, or HTTPS connections can subscribe to InfluxDB and receive
|
||||
a copy of all data as it is written.
|
||||
|
||||
## How subscriptions work
|
||||
|
||||
As data is written to InfluxDB, writes are duplicated to subscriber endpoints via
|
||||
HTTP, HTTPS, or UDP in [line protocol](/enterprise_influxdb/v1.10/write_protocols/line_protocol_tutorial/).
|
||||
The InfluxDB subscriber service creates multiple "writers" ([goroutines](https://golangbot.com/goroutines/))
|
||||
which send writes to the subscription endpoints.
|
||||
|
||||
_The number of writer goroutines is defined by the [`write-concurrency`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#write-concurrency-40) configuration._
|
||||
|
||||
As writes occur in InfluxDB, each subscription writer sends the written data to the
|
||||
specified subscription endpoints.
|
||||
However, with a high `write-concurrency` (multiple writers) and a high ingest rate,
|
||||
nanosecond differences in writer processes and the transport layer can result
|
||||
in writes being received out of order.
|
||||
|
||||
> #### Important information about high write loads
|
||||
> While setting the subscriber `write-concurrency` to greater than 1 does increase your
|
||||
> subscriber write throughput, it can result in out-of-order writes under high ingest rates.
|
||||
> Setting `write-concurrency` to 1 ensures writes are passed to subscriber endpoints sequentially,
|
||||
> but can create a bottleneck under high ingest rates.
|
||||
>
|
||||
> What `write-concurrency` should be set to depends on your specific workload
|
||||
> and need for in-order writes to your subscription endpoint.
|
||||
|
||||
## InfluxQL subscription statements
|
||||
|
||||
Use the following InfluxQL statements to manage subscriptions:
|
||||
|
||||
[`CREATE SUBSCRIPTION`](#create-subscriptions)
|
||||
[`SHOW SUBSCRIPTIONS`](#show-subscriptions)
|
||||
[`DROP SUBSCRIPTION`](#remove-subscriptions)
|
||||
|
||||
## Create subscriptions
|
||||
|
||||
Create subscriptions using the `CREATE SUBSCRIPTION` InfluxQL statement.
|
||||
Specify the subscription name, the database name and retention policy to subscribe to,
|
||||
and the URL of the host to which data written to InfluxDB should be copied.
|
||||
|
||||
```sql
|
||||
-- Pattern:
|
||||
CREATE SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>" DESTINATIONS <ALL|ANY> "<subscription_endpoint_host>"
|
||||
|
||||
-- Examples:
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to 'example.com:9090' via HTTP.
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://example.com:9090'
|
||||
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that round-robins the data to 'h1.example.com:9090' and 'h2.example.com:9090' via UDP.
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ANY 'udp://h1.example.com:9090', 'udp://h2.example.com:9090'
|
||||
```
|
||||
If authentication is enabled on the subscriber host, include the credentials in the URL.
|
||||
|
||||
```sql
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to another InfluxDB on 'example.com:8086' via HTTP. Authentication is enabled on the subscription host (user: subscriber, pass: secret).
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://subscriber:secret@example.com:8086'
|
||||
```
|
||||
|
||||
{{% warn %}}
|
||||
`SHOW SUBSCRIPTIONS` outputs all subscriber URLs in plain text, including those with authentication credentials.
|
||||
Any user with the privileges to run `SHOW SUBSCRIPTIONS` is able to see these credentials.
|
||||
{{% /warn %}}
|
||||
|
||||
### Sending subscription data to multiple hosts
|
||||
|
||||
The `CREATE SUBSCRIPTION` statement allows you to specify multiple hosts as endpoints for the subscription.
|
||||
In your `DESTINATIONS` clause, you can pass multiple host strings separated by commas.
|
||||
Using `ALL` or `ANY` in the `DESTINATIONS` clause determines how InfluxDB writes data to each endpoint:
|
||||
|
||||
`ALL`: Writes data to all specified hosts.
|
||||
|
||||
`ANY`: Round-robins writes between specified hosts.
|
||||
|
||||
_**Subscriptions with multiple hosts**_
|
||||
|
||||
```sql
|
||||
-- Write all data to multiple hosts
|
||||
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ALL 'http://host1.example.com:9090', 'http://host2.example.com:9090'
|
||||
|
||||
-- Round-robin writes between multiple hosts
|
||||
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ANY 'http://host1.example.com:9090', 'http://host2.example.com:9090'
|
||||
```
|
||||
|
||||
### Subscription protocols
|
||||
|
||||
Subscriptions can use HTTP, HTTPS, or UDP transport protocols.
|
||||
Which to use is determined by the protocol expected by the subscription endpoint.
|
||||
If creating a Kapacitor subscription, this is defined by the `subscription-protocol`
|
||||
option in the `[[influxdb]]` section of your [`kapacitor.conf`](/{{< latest "kapacitor" >}}/administration/subscription-management/#subscription-protocol).
|
||||
|
||||
_**kapacitor.conf**_
|
||||
|
||||
```toml
|
||||
[[influxdb]]
|
||||
|
||||
# ...
|
||||
|
||||
subscription-protocol = "http"
|
||||
|
||||
# ...
|
||||
|
||||
```
|
||||
|
||||
_For information regarding HTTPS connections and secure communication between InfluxDB and Kapacitor,
|
||||
view the [Kapacitor security](/kapacitor/v1.5/administration/security/#secure-influxdb-and-kapacitor) documentation._
|
||||
|
||||
## Show subscriptions
|
||||
|
||||
The `SHOW SUBSCRIPTIONS` InfluxQL statement returns a list of all subscriptions registered in InfluxDB.
|
||||
|
||||
```sql
|
||||
SHOW SUBSCRIPTIONS
|
||||
```
|
||||
|
||||
_**Example output:**_
|
||||
|
||||
```bash
|
||||
name: _internal
|
||||
retention_policy name mode destinations
|
||||
---------------- ---- ---- ------------
|
||||
monitor kapacitor-39545771-7b64-4692-ab8f-1796c07f3314 ANY [http://localhost:9092]
|
||||
```
|
||||
|
||||
## Remove subscriptions
|
||||
|
||||
Remove or drop subscriptions using the `DROP SUBSCRIPTION` InfluxQL statement.
|
||||
|
||||
```sql
|
||||
-- Pattern:
|
||||
DROP SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>"
|
||||
|
||||
-- Example:
|
||||
DROP SUBSCRIPTION "sub0" ON "mydb"."autogen"
|
||||
```
|
||||
|
||||
### Drop all subscriptions
|
||||
|
||||
In some cases, it may be necessary to remove all subscriptions.
|
||||
Run the following bash script that utilizes the `influx` CLI, loops through all subscriptions, and removes them.
|
||||
This script depends on the `$INFLUXUSER` and `$INFLUXPASS` environment variables.
|
||||
If these are not set, export them as part of the script.
|
||||
|
||||
```bash
|
||||
# Environment variable exports:
|
||||
# Uncomment these if INFLUXUSER and INFLUXPASS are not already globally set.
|
||||
# export INFLUXUSER=influxdb-username
|
||||
# export INFLUXPASS=influxdb-password
|
||||
|
||||
IFS=$'\n'
for i in $(influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal -execute 'show subscriptions' | tail -n +2 | grep -v name); do
  influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal \
    -execute "drop subscription \"$(echo "$i" | cut -f 3 -d ',')\" ON \"$(echo "$i" | cut -f 1 -d ',')\".\"$(echo "$i" | cut -f 2 -d ',')\""
done
|
||||
```
|
||||
|
||||
## Configure InfluxDB subscriptions
|
||||
|
||||
InfluxDB subscription configuration options are available in the `[subscriber]`
|
||||
section of the `influxdb.conf`.
|
||||
In order to use subscriptions, the `enabled` option in the `[subscriber]` section must be set to `true`.
|
||||
Below is an example `influxdb.conf` subscriber configuration:
|
||||
|
||||
```toml
|
||||
[subscriber]
|
||||
enabled = true
|
||||
http-timeout = "30s"
|
||||
insecure-skip-verify = false
|
||||
ca-certs = ""
|
||||
write-concurrency = 40
|
||||
write-buffer-size = 1000
|
||||
```
|
||||
|
||||
_**Descriptions of `[subscriber]` configuration options are available in the [data node configuration](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#subscription-settings) documentation.**_
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Inaccessible or decommissioned subscription endpoints
|
||||
|
||||
Unless a subscription is [dropped](#remove-subscriptions), InfluxDB assumes the endpoint
|
||||
should always receive data and will continue to attempt to send data.
|
||||
If an endpoint host is inaccessible or has been decommissioned, you will see errors
|
||||
similar to the following:
|
||||
|
||||
```bash
|
||||
# Some message content omitted (...) for the sake of brevity
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" ... service=subscriber
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp x.y.z.a:9092: getsockopt: connection refused" ... service=subscriber
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp 172.31.36.5:9092: getsockopt: no route to host" ... service=subscriber
|
||||
```
|
||||
|
||||
In some cases, this may be caused by a networking error or something similar
|
||||
preventing a successful connection to the subscription endpoint.
|
||||
In other cases, it's because the subscription endpoint no longer exists and
|
||||
the subscription hasn't been dropped from InfluxDB.
|
||||
|
||||
> Because InfluxDB does not know if a subscription endpoint will or will not become accessible again,
|
||||
> subscriptions are not automatically dropped when an endpoint becomes inaccessible.
|
||||
> If a subscription endpoint is removed, you must manually [drop the subscription](#remove-subscriptions) from InfluxDB.
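
For example, a sketch of cleaning up a subscription whose endpoint has been decommissioned (the subscription, database, and retention policy names below come from the `SHOW SUBSCRIPTIONS` output above; substitute your own, and add `-username`/`-password` if authentication is enabled):

```sh
# List subscriptions and note the database, retention policy, and name of the stale entry.
influx -execute 'SHOW SUBSCRIPTIONS'

# Drop the subscription that points at the decommissioned endpoint.
influx -execute 'DROP SUBSCRIPTION "kapacitor-39545771-7b64-4692-ab8f-1796c07f3314" ON "_internal"."monitor"'
```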
|
|
@ -0,0 +1,18 @@
|
|||
---
|
||||
title: Manage users and permissions
|
||||
description: Manage authorization in InfluxDB Enterprise clusters with users, roles, and permissions.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manage users and permissions
|
||||
parent: Manage
|
||||
weight: 20
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/authentication_and_authorization/
|
||||
---
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
||||
|
||||
_For information about how to configure HTTPs over TLS, LDAP authentication, and password hashing,
|
||||
see [Configure security](/enterprise_influxdb/v1.10/administration/configure/security/)._
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,410 @@
|
|||
---
|
||||
title: Manage authorization with the InfluxDB Enterprise Meta API
|
||||
description: >
|
||||
Manage users and permissions with the InfluxDB Enterprise Meta API.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manage authorization with the API
|
||||
parent: Manage users and permissions
|
||||
weight: 41
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/
|
||||
- /enterprise_influxdb/v1.10/administration/security/authentication_and_authorization-api/
|
||||
---
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
||||
|
||||
Use the InfluxDB Enterprise Meta API to manage authorization for a cluster.
|
||||
|
||||
The API can be used to manage both cluster-wide and database-specific [permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
|
||||
Chronograf can only manage cluster-wide permissions.
|
||||
To manage permissions at the database level, use the API.
|
||||
|
||||
<!--
|
||||
## permission "tokens"
|
||||
Predefined key tokens take the form of verb-object pairs.
|
||||
When the token lacks the verb part, full management privileges are implied.
|
||||
These predefined tokens are:
|
||||
-->
|
||||
|
||||
For more information, see [Enterprise users and permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/).
|
||||
|
||||
### Example API requests
|
||||
|
||||
{{% note %}}
|
||||
Many of the examples below use the `jq` utility to format JSON output for readability.
|
||||
[Install `jq`](https://stedolan.github.io/jq/download/) to process JSON output.
|
||||
If you don’t have access to `jq`, remove the `| jq` shown in the example.
|
||||
{{% /note %}}
|
||||
|
||||
**Users**:
|
||||
|
||||
- [List users](#list-users)
|
||||
- [Create a user against a follower node](#create-a-user-against-a-follower-node)
|
||||
- [Create a user against the lead node](#create-a-user-against-the-lead-node)
|
||||
- [Retrieve a user details document](#retrieve-a-user-details-document)
|
||||
- [Grant permissions to a user for all databases](#grant-permissions-to-a-user-for-all-databases)
|
||||
- [Grant permissions to a user for a specific database](#grant-permissions-to-a-user-for-a-specific-database)
|
||||
- [Verify user permissions](#verify-user-permissions)
|
||||
- [Remove permissions from a user](#remove-permissions-from-a-user)
|
||||
- [Remove a user](#remove-a-user)
|
||||
- [Verify user removal](#verify-user-removal)
|
||||
- [Change a user's password](#change-a-users-password)
|
||||
|
||||
**Roles**:
|
||||
|
||||
- [List roles](#list-roles)
|
||||
- [Create a role](#create-a-role)
|
||||
- [Verify roles](#verify-roles)
|
||||
- [Retrieve a role document](#retrieve-a-role-document)
|
||||
- [Add permissions to a role for all databases](#add-permissions-to-a-role-for-all-databases)
|
||||
- [Add permissions to a role for a specific database](#add-permissions-to-a-role-for-a-specific-database)
|
||||
- [Verify role permissions](#verify-role-permissions)
|
||||
- [Add a user to a role](#add-a-user-to-a-role)
|
||||
- [Verify user in role](#verify-user-in-role)
|
||||
- [Remove a user from a role](#remove-a-user-from-a-role)
|
||||
- [Remove a permission from a role](#remove-a-permission-from-a-role)
|
||||
- [Delete a role](#delete-a-role)
|
||||
- [Verify role deletion](#verify-role-deletion)
|
||||
|
||||
#### Users
|
||||
|
||||
Use the `/user` endpoint of the InfluxDB Enterprise Meta API to manage users.
|
||||
|
||||
##### List users
|
||||
View a list of existing users.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin:changeit" -s https://cluster_node_1:8091/user | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"users": [
|
||||
{
|
||||
"hash": "$2a$10$NelNfrWdxubN0/TnP7DwquKB9/UmJnyZ7gy0i69MPldK73m.2WfCu",
|
||||
"name": "admin",
|
||||
"permissions": {
|
||||
"": [
|
||||
"ViewAdmin",
|
||||
"ViewChronograf",
|
||||
"CreateDatabase",
|
||||
"CreateUserAndRole",
|
||||
"AddRemoveNode",
|
||||
"DropDatabase",
|
||||
"DropData",
|
||||
"ReadData",
|
||||
"WriteData",
|
||||
"Rebalance",
|
||||
"ManageShard",
|
||||
"ManageContinuousQuery",
|
||||
"ManageQuery",
|
||||
"ManageSubscription",
|
||||
"Monitor",
|
||||
"CopyShard",
|
||||
"KapacitorAPI",
|
||||
"KapacitorConfigAPI"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Create a user against a follower node
|
||||
|
||||
Transactions that modify the user store must be sent to the lead meta node using `POST`.
|
||||
|
||||
If the node returns a 307 redirect message,
|
||||
try resending the request to the lead node as indicated by the `Location` field in the HTTP response header.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin:changeit" -s -v \
|
||||
-d '{"action":"create","user":{"name":"phantom2","password":"changeit"}}' \
|
||||
https://cluster_node_2:8091/user
|
||||
```
|
||||
|
||||
##### Create a user against the lead node
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin:changeit" -s -v \
|
||||
-d '{"action":"create","user":{"name":"phantom","password":"changeit"}}' \
|
||||
https://cluster_node_1:8091/user
|
||||
```
|
||||
|
||||
##### Retrieve a user details document
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"users": [
|
||||
{
|
||||
"hash": "$2a$10$hR.Ih6DpIHUaynA.uqFhpOiNUgrADlwg3rquueHDuw58AEd7zk5hC",
|
||||
"name": "phantom"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Grant permissions to a user for all databases
|
||||
|
||||
To grant a list of permissions for all databases in a cluster,
|
||||
use the `""` key in the permissions object, as shown in the example below.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"add-permissions","user":{"name":"phantom","permissions":{"":["ReadData", "WriteData"]}}}' \
|
||||
https://cluster_node_1:8091/user
|
||||
```
|
||||
|
||||
##### Grant permissions to a user for a specific database
|
||||
|
||||
Grant `ReadData` and `WriteData` permissions to the user named `phantom` for `MyDatabase`.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"add-permissions","user":{"name":"phantom","permissions":{"MyDatabase":["ReadData","WriteData"]}}}' \
|
||||
https://cluster_node_1:8091/user
|
||||
```
|
||||
|
||||
##### Verify user permissions
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"users": [
|
||||
{
|
||||
"hash": "$2a$10$hR.Ih6DpIHUaynA.uqFhpOiNUgrADlwg3rquueHDuw58AEd7zk5hC",
|
||||
"name": "phantom",
|
||||
"permissions": {
|
||||
"MyDatabase": [
|
||||
"ReadData",
|
||||
"WriteData"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Remove permissions from a user
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"remove-permissions","user":{"name":"phantom","permissions":{"":["KapacitorConfigAPI"]}}}' \
|
||||
https://cluster_node_1:8091/user
|
||||
```
|
||||
|
||||
##### Remove a user
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"delete","user":{"name":"phantom2"}}' \
|
||||
https://cluster_node_1:8091/user
|
||||
```
|
||||
|
||||
##### Verify user removal
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/user?name=phantom
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"error": "user not found"
|
||||
}
|
||||
```
|
||||
|
||||
##### Change a user's password
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin:changeit" -H "Content-Type: application/json" \
|
||||
-d '{"action": "change-password", "user": {"name": "<username>", "password": "newpassword"}}' \
|
||||
localhost:8091/user
|
||||
```
|
||||
|
||||
<!-- TODO -->
|
||||
|
||||
#### Roles
|
||||
|
||||
The InfluxDB Enterprise Meta API provides a `/role` endpoint for managing roles.
|
||||
|
||||
##### List roles
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role | jq
|
||||
```
|
||||
|
||||
```
|
||||
{}
|
||||
```
|
||||
|
||||
In a fresh installation, no roles have been created yet.
|
||||
As when creating a user, send the request to the lead meta node.
|
||||
|
||||
##### Create a role
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -v \
|
||||
-d '{"action":"create","role":{"name":"spectre"}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Verify roles
|
||||
Verify the role has been created.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"roles": [
|
||||
{
|
||||
"name": "djinn",
|
||||
},
|
||||
{
|
||||
"name": "spectre"
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
##### Retrieve a role document
|
||||
Retrieve a record for a single role.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"roles": [
|
||||
{
|
||||
"name": "spectre"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Add permissions to a role for all databases
|
||||
|
||||
To grant a list of permissions to a role for all databases in a cluster,
|
||||
use the `""` key in the permissions object, as shown in the example below.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"add-permissions","role":{"name":"spectre","permissions":{"":["ReadData","WriteData"]}}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
|
||||
##### Add permissions to a role for a specific database
|
||||
|
||||
Grant `ReadData` and `WriteData` permissions to the role named `spectre` for `MyDatabase`.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"add-permissions","role":{"name":"spectre","permissions":{"MyDatabase":["ReadData","WriteData"]}}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Verify role permissions
|
||||
Verify permissions have been added.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"roles": [
|
||||
{
|
||||
"name": "spectre",
|
||||
"permissions": {
|
||||
"MyDatabase": [
|
||||
"ReadData",
|
||||
"WriteData"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Add a user to a role
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"add-users","role":{"name":"spectre","users":["phantom"]}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Verify user in role
|
||||
Verify user has been added to role.
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"roles": [
|
||||
{
|
||||
"name": "spectre",
|
||||
"permissions": {
|
||||
"": [
|
||||
"KapacitorAPI",
|
||||
"KapacitorConfigAPI"
|
||||
]
|
||||
},
|
||||
"users": [
|
||||
"phantom"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
##### Remove a user from a role
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"remove-users","role":{"name":"spectre","users":["phantom"]}}' \
|
||||
https://admin:changeit@cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Remove a permission from a role
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"remove-permissions","role":{"name":"spectre","permissions":{"":["KapacitorConfigAPI"]}}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Delete a role
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s -v \
|
||||
-d '{"action":"delete","role":{"name":"spectre"}}' \
|
||||
https://cluster_node_1:8091/role
|
||||
```
|
||||
|
||||
##### Verify role deletion
|
||||
|
||||
```sh
|
||||
curl --location-trusted --negotiate -u "admin:changeit" -s https://cluster_node_1:8091/role?name=spectre | jq
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"error": "role not found"
|
||||
}
|
||||
```
|
|
@ -0,0 +1,255 @@
|
|||
---
|
||||
title: Manage authorization with InfluxQL
|
||||
description: >
|
||||
Manage users and permissions with InfluxQL.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Manage users and permissions
|
||||
weight: 40
|
||||
related:
|
||||
- /enterprise_influxdb/v1.10/administration/manage/security/authorization-api.md
|
||||
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
|
||||
- /enterprise_influxdb/v1.10/administration/manage/security/fine-grained-authorization/
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/
|
||||
---
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
||||
|
||||
{{% note %}}
|
||||
We recommend using [Chronograf](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/)
|
||||
and/or the [Enterprise meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/)
|
||||
to manage InfluxDB Enterprise users and roles.
|
||||
{{% /note %}}
|
||||
|
||||
{{% warn %}}
|
||||
Outside of [creating users](/enterprise_influxdb/v1.10/query_language/spec/#create-user),
|
||||
we recommend operators *do not* mix and match InfluxQL
|
||||
with other authorization management methods (Chronograf and the API).
|
||||
Doing so may lead to inconsistencies in user permissions.
|
||||
{{% /warn %}}
|
||||
|
||||
This page shows examples of basic user and permission management using InfluxQL statements.
|
||||
However, *only a subset of Enterprise permissions can be managed with InfluxQL.*
|
||||
Using InfluxQL, you can perform the following actions:
|
||||
|
||||
- Create new users and assign them either the admin role or no role.
|
||||
- Grant `READ`, `WRITE`, or `ALL` database permissions to users.
|
||||
- `REVOKE` permissions from users.
|
||||
- `GRANT` or `REVOKE` specific database access to individual users.
|
||||
|
||||
However, InfluxDB Enterprise offers an [*expanded set of permissions*](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
|
||||
You can use the Meta API and Chronograf to access and assign these more granular permissions to individual users.
|
||||
|
||||
The [InfluxDB Enterprise meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/)
|
||||
provides the most comprehensive way to manage users, roles, permissions,
|
||||
and other [fine-grained authorization](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/fine-grained-authorization/) (FGA) capabilities.
|
||||
|
||||
#### Non-admin users
|
||||
|
||||
When authentication is enabled,
|
||||
a new non-admin user has no access to any database
|
||||
until they are specifically [granted privileges to a database](#grant-read-write-or-all-database-privileges-to-an-existing-user)
|
||||
by an admin user.
|
||||
|
||||
Non-admin users can [`SHOW`](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-databases)
|
||||
the databases for which they have `ReadData` or `WriteData` permissions.
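
For example, a non-admin user can check which databases are visible to them from the `influx` CLI (the `todd` user and password reuse the example credentials created below):

```sh
influx -username todd -password 'influxdb41yf3' -execute 'SHOW DATABASES'
```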
|
||||
|
||||
### User management commands
|
||||
|
||||
User management commands apply to either
|
||||
[admin users](#manage-admin-users),
|
||||
[non-admin users](#manage-non-admin-users),
|
||||
or [both](#manage-admin-and-non-admin-users).
|
||||
|
||||
For more information about these commands,
|
||||
see [Database management](/enterprise_influxdb/v1.10/query_language/manage-database/) and
|
||||
[Continuous queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/).
|
||||
|
||||
#### Manage admin users
|
||||
|
||||
Create an admin user with:
|
||||
|
||||
```sql
|
||||
CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Repeating the exact `CREATE USER` statement is idempotent.
|
||||
If any values change, the database returns a duplicate user error.
|
||||
|
||||
```sql
|
||||
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
|
||||
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
|
||||
> CREATE USER todd WITH PASSWORD '123' WITH ALL PRIVILEGES
|
||||
ERR: user already exists
|
||||
> CREATE USER todd WITH PASSWORD '123456'
|
||||
ERR: user already exists
|
||||
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
|
||||
>
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
##### `GRANT` administrative privileges to an existing user
|
||||
```sql
|
||||
GRANT ALL PRIVILEGES TO <username>
|
||||
```
|
||||
|
||||
##### `REVOKE` administrative privileges from an admin user
|
||||
```sql
|
||||
REVOKE ALL PRIVILEGES FROM <username>
|
||||
```
|
||||
|
||||
##### `SHOW` all existing users and their admin status
|
||||
```sql
|
||||
SHOW USERS
|
||||
```
|
||||
|
||||
###### CLI Example
|
||||
```sql
|
||||
> SHOW USERS
|
||||
user admin
|
||||
todd false
|
||||
paul true
|
||||
hermione false
|
||||
dobby false
|
||||
```
|
||||
|
||||
#### Manage non-admin users
|
||||
|
||||
##### `CREATE` a new non-admin user
|
||||
```sql
|
||||
CREATE USER <username> WITH PASSWORD '<password>'
|
||||
```
|
||||
|
||||
###### CLI example
|
||||
```sql
|
||||
> CREATE USER todd WITH PASSWORD 'influxdb41yf3'
|
||||
> CREATE USER alice WITH PASSWORD 'wonder\'land'
|
||||
> CREATE USER "rachel_smith" WITH PASSWORD 'asdf1234!'
|
||||
> CREATE USER "monitoring-robot" WITH PASSWORD 'XXXXX'
|
||||
> CREATE USER "$savyadmin" WITH PASSWORD 'm3tr1cL0v3r'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
##### Important notes about providing user credentials
|
||||
- The user value must be wrapped in double quotes if
|
||||
it starts with a digit, is an InfluxQL keyword, contains a hyphen,
|
||||
or includes any special characters (for example: `!@#$%^&*()-`).
|
||||
- The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
|
||||
Do not include the single quotes when authenticating requests.
|
||||
We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
|
||||
For passwords that include these characters, escape the special character with a backslash
|
||||
(for example, `\'`) when creating the password and when submitting authentication requests.
|
||||
- Repeating the exact `CREATE USER` statement is idempotent.
|
||||
If any values change, the database returns a duplicate user error.
|
||||
|
||||
###### CLI example
|
||||
```sql
|
||||
> CREATE USER "todd" WITH PASSWORD '123456'
|
||||
> CREATE USER "todd" WITH PASSWORD '123456'
|
||||
> CREATE USER "todd" WITH PASSWORD '123'
|
||||
ERR: user already exists
|
||||
> CREATE USER "todd" WITH PASSWORD '123456'
|
||||
> CREATE USER "todd" WITH PASSWORD '123456' WITH ALL PRIVILEGES
|
||||
ERR: user already exists
|
||||
> CREATE USER "todd" WITH PASSWORD '123456'
|
||||
>
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
##### `GRANT` `READ`, `WRITE` or `ALL` database privileges to an existing user
|
||||
|
||||
```sql
|
||||
GRANT [READ,WRITE,ALL] ON <database_name> TO <username>
|
||||
```
|
||||
|
||||
CLI examples:
|
||||
|
||||
`GRANT` `READ` access to `todd` on the `NOAA_water_database` database:
|
||||
|
||||
```sql
|
||||
> GRANT READ ON "NOAA_water_database" TO "todd"
|
||||
```
|
||||
|
||||
`GRANT` `ALL` access to `todd` on the `NOAA_water_database` database:
|
||||
|
||||
```sql
|
||||
> GRANT ALL ON "NOAA_water_database" TO "todd"
|
||||
```
|
||||
|
||||
##### `REVOKE` `READ`, `WRITE`, or `ALL` database privileges from an existing user
|
||||
|
||||
```sql
|
||||
REVOKE [READ,WRITE,ALL] ON <database_name> FROM <username>
|
||||
```
|
||||
|
||||
CLI examples:
|
||||
|
||||
`REVOKE` `ALL` privileges from `todd` on the `NOAA_water_database` database:
|
||||
|
||||
```sql
|
||||
> REVOKE ALL ON "NOAA_water_database" FROM "todd"
|
||||
```
|
||||
|
||||
`REVOKE` `WRITE` privileges from `todd` on the `NOAA_water_database` database:
|
||||
|
||||
```sql
|
||||
> REVOKE WRITE ON "NOAA_water_database" FROM "todd"
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
If a user with `ALL` privileges has `WRITE` privileges revoked, they are left with `READ` privileges, and vice versa.
|
||||
{{% /note %}}
|
||||
|
||||
##### `SHOW` a user's database privileges
|
||||
|
||||
```sql
|
||||
SHOW GRANTS FOR <user_name>
|
||||
```
|
||||
|
||||
CLI example:
|
||||
|
||||
```sql
|
||||
> SHOW GRANTS FOR "todd"
|
||||
database privilege
|
||||
NOAA_water_database WRITE
|
||||
another_database_name READ
|
||||
yet_another_database_name ALL PRIVILEGES
|
||||
one_more_database_name NO PRIVILEGES
|
||||
```
|
||||
|
||||
#### Manage admin and non-admin users
|
||||
|
||||
##### Reset a user's password
|
||||
|
||||
```sql
|
||||
SET PASSWORD FOR <username> = '<password>'
|
||||
```
|
||||
|
||||
CLI example:
|
||||
|
||||
```sql
|
||||
> SET PASSWORD FOR "todd" = 'password4todd'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
|
||||
Do not include the single quotes when authenticating requests.
|
||||
|
||||
We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
|
||||
For passwords that include these characters, escape the special character with a backslash (for example, `\'`) when creating the password and when submitting authentication requests.
|
||||
{{% /note %}}
|
||||
|
||||
##### `DROP` a user
|
||||
|
||||
```sql
|
||||
DROP USER <username>
|
||||
```
|
||||
|
||||
CLI example:
|
||||
|
||||
```sql
|
||||
> DROP USER "todd"
|
||||
```
|
||||
|
|
@ -0,0 +1,659 @@
|
|||
---
|
||||
title: Manage fine-grained authorization
|
||||
description: >
|
||||
Fine-grained authorization (FGA) in InfluxDB Enterprise controls user access at the database, measurement, and series levels.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Manage users and permissions
|
||||
weight: 44
|
||||
aliases:
|
||||
- /docs/v1.5/administration/fga
|
||||
- /enterprise_influxdb/v1.10/guides/fine-grained-authorization/
|
||||
related:
|
||||
- /enterprise_influxdb/v1.10/administration/authentication_and_authorization/
|
||||
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
|
||||
---
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
||||
|
||||
Use fine-grained authorization (FGA) to control user access at the database, measurement, and series levels.
|
||||
|
||||
You must have [admin permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#admin) to set up FGA.
|
||||
|
||||
{{% warn %}}
|
||||
#### FGA does not apply to Flux
|
||||
FGA does not restrict actions performed by Flux queries (both read and write).
|
||||
If using FGA, we recommend [disabling Flux](/enterprise_influxdb/v{{< current-version >}}/flux/installation/).
|
||||
{{% /warn %}}
|
||||
|
||||
{{% note %}}
|
||||
FGA is only available in InfluxDB Enterprise.
|
||||
InfluxDB OSS 1.x controls access at the database level only.
|
||||
{{% /note %}}
|
||||
|
||||
## Set up fine-grained authorization
|
||||
|
||||
1. [Enable authentication](/enterprise_influxdb/v1.10/administration/configure/security/authentication/) in your InfluxDB configuration file.
|
||||
|
||||
2. Create users through the InfluxDB query API.
|
||||
|
||||
```sql
|
||||
CREATE USER username WITH PASSWORD 'password'
|
||||
```
|
||||
|
||||
For more information, see [User management commands](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-influxql/#user-management-commands).
|
||||
|
||||
3. Ensure that you can access the **meta node** API (port 8091 by default). See the example after these steps.
|
||||
|
||||
{{% note %}}
|
||||
In a typical cluster configuration, the HTTP ports for data nodes
|
||||
(8086 by default) are exposed to clients but the meta node HTTP ports are not.
|
||||
You may need to work with your network administrator to gain access to the meta node HTTP ports.
|
||||
{{% /note %}}
|
||||
|
||||
4. Create users. Do the following:
|
||||
1. As Administrator, create users and grant users all permissions. The example below grants users `east` and `west` all permissions on the `datacenters` database.
|
||||
|
||||
```sql
|
||||
CREATE DATABASE datacenters
|
||||
|
||||
CREATE USER east WITH PASSWORD 'east'
|
||||
GRANT ALL ON datacenters TO east
|
||||
|
||||
CREATE USER west WITH PASSWORD 'west'
|
||||
GRANT ALL ON datacenters TO west
|
||||
```
|
||||
|
||||
2. Add fine-grained permissions to users as needed.
|
||||
|
||||
5. [Create roles](#manage-roles) to grant permissions to users assigned to a role.
|
||||
|
||||
{{% note %}}
|
||||
For an overview of how users and roles work in InfluxDB Enterprise, see [InfluxDB Enterprise users](/enterprise_influxdb/v1.10/features/users/).
|
||||
{{% /note %}}
|
||||
|
||||
6. [Set up restrictions](#manage-restrictions).
|
||||
Restrictions apply to all non-admin users.
|
||||
|
||||
{{% note %}}
|
||||
Permissions (currently "read" and "write") may be restricted independently depending on the scenario.
|
||||
{{% /note %}}
|
||||
|
||||
7. [Set up grants](#manage-grants) to remove restrictions for specified users and roles.
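
As a quick check for step 3, you can confirm access to the meta node API by listing users through it (credentials are placeholders):

```sh
curl --location-trusted -u "admin-username:admin-password" -s "http://localhost:8091/user" | jq
```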
|
||||
|
||||
---
|
||||
|
||||
{{% note %}}
|
||||
#### Notes about examples
|
||||
The examples below use `curl`, a command line tool for transferring data, to send
|
||||
HTTP requests to the Meta API, and [`jq`](https://stedolan.github.io/jq/), a command line JSON processor,
|
||||
to make the JSON output easier to read.
|
||||
Alternatives for each are available, but are not covered in this documentation.
|
||||
|
||||
All examples assume authentication is enabled in InfluxDB.
|
||||
Admin credentials must be sent with each request.
|
||||
Use the `curl -u` flag to pass authentication credentials:
|
||||
|
||||
```sh
|
||||
curl -u "username:password" #...
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
---
|
||||
|
||||
## Matching methods
|
||||
The following matching methods are available when managing restrictions and grants to databases, measurements, or series:
|
||||
|
||||
- `exact` (matches only exact string matches)
|
||||
- `prefix` (matches strings that begin with a specified prefix)
|
||||
|
||||
```sh
|
||||
# Match a database name exactly
|
||||
"database": {"match": "exact", "value": "my_database"}
|
||||
|
||||
# Match any databases that begin with "my_"
|
||||
"database": {"match": "prefix", "value": "my_"}
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Wildcard matching
|
||||
Neither `exact` nor `prefix` matching methods allow for wildcard matching.
|
||||
{{% /note %}}
|
||||
|
||||
## Manage roles
|
||||
Roles allow you to assign permissions to groups of users.
|
||||
The following examples assume the `user1`, `user2` and `ops` users already exist in InfluxDB.
|
||||
|
||||
### Create a role
|
||||
To create a new role, use the InfluxDB Meta API `/role` endpoint with the `action`
|
||||
field set to `create` in the request body.
|
||||
|
||||
The following examples create two new roles:
|
||||
|
||||
- east
|
||||
- west
|
||||
|
||||
```sh
|
||||
# Create east role
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "create",
|
||||
"role": {
|
||||
"name": "east"
|
||||
}
|
||||
}'
|
||||
|
||||
# Create west role
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "create",
|
||||
"role": {
|
||||
"name": "west"
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Specify role permissions
|
||||
To specify permissions for a role,
|
||||
use the InfluxDB Meta API `/role` endpoint with the `action` field set to `add-permissions`.
|
||||
Specify the [permissions](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/#permissions) to add for each database.
|
||||
|
||||
The following example sets read and write permissions on `db1` for both `east` and `west` roles.
|
||||
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-permissions",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-permissions",
|
||||
"role": {
|
||||
"name": "west",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Remove role permissions
|
||||
To remove permissions from a role, use the InfluxDB Meta API `/role` endpoint with the `action` field
|
||||
set to `remove-permissions`.
|
||||
Specify the [permissions](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/#permissions) to remove from each database.
|
||||
|
||||
The following example removes read and write permissions from `db1` for the `east` role.
|
||||
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "remove-permissions",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Assign users to a role
|
||||
To assign users to a role, set the `action` field to `add-users` and include a list
|
||||
of users in the `role` field.
|
||||
|
||||
The following examples add `user1` and the `ops` user to the `east` role, and `user2` and `ops` to the `west` role.
|
||||
|
||||
```sh
|
||||
# Add user1 and ops to the east role
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-users",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"users": ["user1", "ops"]
|
||||
}
|
||||
}'
|
||||
|
||||
# Add user2 and ops to the west role
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-users",
|
||||
"role": {
|
||||
"name": "west",
|
||||
"users": ["user2", "ops"]
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### View existing roles
|
||||
To view existing roles with their assigned permissions and users, use the `GET`
|
||||
request method with the InfluxDB Meta API `/role` endpoint.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin-username:admin-password" -XGET "http://localhost:8091/role" | jq
|
||||
```
|
||||
|
||||
### Delete a role
|
||||
To delete a role, use the InfluxDB Meta API `/role` endpoint, set the `action`
|
||||
field to `delete` and include the name of the role to delete.
|
||||
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "delete",
|
||||
"role": {
|
||||
"name": "west"
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Deleting a role does not delete users assigned to the role.
|
||||
{{% /note %}}
|
||||
|
||||
## Manage restrictions
|
||||
Restrictions limit read permissions, write permissions, or both on InfluxDB assets.
|
||||
Restrictions apply to all non-admin users.
|
||||
[Grants](#manage-grants) override restrictions.
|
||||
|
||||
> In order to run meta queries (such as `SHOW MEASUREMENTS` or `SHOW TAG KEYS`),
|
||||
> users must have read permissions for the database and retention policy they are querying.
|
||||
|
||||
Manage restrictions using the InfluxDB Meta API `acl/restrictions` endpoint.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -XGET "http://localhost:8091/influxdb/v2/acl/restrictions"
|
||||
```
|
||||
|
||||
- [Restrict by database](#restrict-by-database)
|
||||
- [Restrict by measurement in a database](#restrict-by-measurement-in-a-database)
|
||||
- [Restrict by series in a database](#restrict-by-series-in-a-database)
|
||||
- [View existing restrictions](#view-existing-restrictions)
|
||||
- [Update a restriction](#update-a-restriction)
|
||||
- [Remove a restriction](#remove-a-restriction)
|
||||
|
||||
> **Note:** For the best performance, set up minimal restrictions.
|
||||
|
||||
### Restrict by database
|
||||
In most cases, restricting the database is the simplest option, and has minimal impact on performance.
|
||||
The following example restricts reads and writes on the `my_database` database.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
### Restrict by measurement in a database
|
||||
The following example restricts read and write permissions on the `network`
|
||||
measurement in the `my_database` database.
|
||||
_This restriction does not apply to other measurements in the `my_database` database._
|
||||
|
||||
```sh
|
||||
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
### Restrict by series in a database
|
||||
The most fine-grained restriction option is to restrict specific tags in a measurement and database.
|
||||
The following example restricts read and write permissions on the `datacenter=east` tag in the
|
||||
`network` measurement in the `my_database` database.
|
||||
_This restriction does not apply to other tags or tag values in the `network` measurement._
|
||||
|
||||
```sh
|
||||
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
_Consider this option carefully, as it allows writes to `network` without tags or
|
||||
writes to `network` with a tag key of `datacenter` and a tag value of anything but `east`._
|
||||
|
||||
##### Apply restrictions to a series defined by multiple tags
|
||||
```sh
|
||||
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [
|
||||
{"match": "exact", "key": "tag1", "value": "value1"},
|
||||
{"match": "exact", "key": "tag2", "value": "value2"}
|
||||
],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Create multiple restrictions at a time
|
||||
There may be times when you need to create restrictions using unique values for each.
|
||||
To create multiple restrictions for a list of values, use a bash `for` loop:
|
||||
|
||||
```sh
|
||||
for value in val1 val2 val3 val4; do
|
||||
curl --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "'$value'"}],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
done
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
### View existing restrictions
|
||||
To view existing restrictions, use the `GET` request method with the `acl/restrictions` endpoint.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin-username:admin-password" -XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
|
||||
```
|
||||
|
||||
### Update a restriction
|
||||
_You cannot directly modify a restriction.
|
||||
Delete the existing restriction and create a new one with updated parameters._
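
A sketch of that delete-and-recreate flow (`<restriction_id>` and the credentials are placeholders):

```sh
# Find the ID of the restriction to change.
curl --location-trusted -u "admin-username:admin-password" \
  -XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq

# Delete it, then recreate it with the updated parameters.
curl --location-trusted -u "admin-username:admin-password" \
  -XDELETE "http://localhost:8091/influxdb/v2/acl/restrictions/<restriction_id>"

curl --location-trusted -u "admin-username:admin-password" \
  -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
  -H "Content-Type: application/json" \
  --data-binary '{
    "database": {"match": "exact", "value": "my_database"},
    "permissions": ["read"]
  }'
```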
|
||||
|
||||
### Remove a restriction
|
||||
To remove a restriction, obtain the restriction ID using the `GET` request method
|
||||
with the `acl/restrictions` endpoint.
|
||||
Use the `DELETE` request method to delete a restriction by ID.
|
||||
|
||||
```sh
|
||||
# Obtain the restriction ID from the list of restrictions
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
|
||||
|
||||
# Delete the restriction using the restriction ID
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XDELETE "http://localhost:8091/influxdb/v2/acl/restrictions/<restriction_id>"
|
||||
```
|
||||
|
||||
## Manage grants
|
||||
Grants remove restrictions and grant users or roles either or both read and write
|
||||
permissions on InfluxDB assets.
|
||||
|
||||
Manage grants using the InfluxDB Meta API `acl/grants` endpoint.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants"
|
||||
```
|
||||
|
||||
- [Grant permissions by database](#grant-permissions-by-database)
|
||||
- [Grant permissions by measurement in a database](#grant-permissions-by-measurement-in-a-database)
|
||||
- [Grant permissions by series in a database](#grant-permissions-by-series-in-a-database)
|
||||
- [View existing grants](#view-existing-grants)
|
||||
- [Update a grant](#update-a-grant)
|
||||
- [Remove a grant](#remove-a-grant)
|
||||
|
||||
### Grant permissions by database
|
||||
The following examples grant read and write permissions on the `my_database` database.
|
||||
|
||||
> **Note:** This offers no guarantee that the users will write to the correct measurement or use the correct tags.
|
||||
|
||||
##### Grant database-level permissions to users
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"],
|
||||
"users": [
|
||||
{"name": "user1"},
|
||||
{"name": "user2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant database-level permissions to roles
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [
|
||||
{"name": "role1"},
|
||||
{"name": "role2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant permissions by measurement in a database
|
||||
The following examples grant permissions to the `network` measurement in the `my_database` database.
|
||||
These grants do not apply to other measurements in the `my_database` database nor
|
||||
guarantee that users will use the correct tags.
|
||||
|
||||
##### Grant measurement-level permissions to users
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"],
|
||||
"users": [
|
||||
{"name": "user1"},
|
||||
{"name": "user2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
To grant access for roles, run:
|
||||
|
||||
```sh
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [
|
||||
{"name": "role1"},
|
||||
{"name": "role2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant permissions by series in a database
|
||||
|
||||
The following examples grant access only to data with the corresponding `datacenter` tag.
|
||||
_Neither guarantees the users will use the `network` measurement._
|
||||
|
||||
##### Grant series-level permissions to users
|
||||
```sh
|
||||
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user1"}]
|
||||
}'
|
||||
|
||||
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant series-level permissions to roles
|
||||
```sh
|
||||
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role1"}]
|
||||
}'
|
||||
|
||||
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant access to specific series in a measurement
|
||||
The following examples grant read and write permissions to corresponding `datacenter`
|
||||
tags in the `network` measurement.
|
||||
_They each specify the measurement in the request body._
|
||||
|
||||
##### Grant series-level permissions in a measurement to users
|
||||
```sh
|
||||
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user1"}]
|
||||
}'
|
||||
|
||||
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant series-level permissions in a measurement to roles
|
||||
```sh
|
||||
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role1"}]
|
||||
}'
|
||||
|
||||
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s --location-trusted -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
Grants for specific series also apply to [meta queries](/enterprise_influxdb/v1.10/query_language/schema_exploration).
|
||||
Results from meta queries are restricted based on series-level permissions.
|
||||
For example, `SHOW TAG VALUES` only returns tag values that the user is authorized to see.
|
||||
|
||||
With these grants in place, a user or role can only read or write data from or to
|
||||
the `network` measurement if the data includes the appropriate `datacenter` tag set.
|
||||
|
||||
{{% note %}}
|
||||
Note that only the presence of the tag is required;
a point tagged `datacenter=east,foo=bar` is also accepted.
|
||||
{{% /note %}}
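
For example, assuming `user1` holds the `datacenter=east` grant above and authenticates with a hypothetical password, a write sketch like the following is accepted because the point carries the `datacenter=east` tag, even though it also carries another tag:

```sh
# Hypothetical credentials; the write succeeds because the point includes datacenter=east.
curl -s -XPOST "http://localhost:8086/write?db=my_database" \
  -u "user1:user1-password" \
  --data-binary 'network,datacenter=east,foo=bar rx_bytes=1024i'
```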
|
||||
|
||||
### View existing grants
|
||||
To view existing grants, use the `GET` request method with the `acl/grants` endpoint.
|
||||
|
||||
```sh
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
|
||||
```
|
||||
|
||||
### Update a grant
|
||||
_You cannot directly modify a grant.
Delete the existing grant and create a new one with updated parameters._
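
For example, a sketch of the delete-then-recreate flow using the endpoints shown in this section (`<grant_id>` and the new grant body are placeholders):

```sh
# Find the grant ID with the GET request shown above, then delete the old grant.
curl -s --location-trusted -u "admin-username:admin-password" \
  -XDELETE "http://localhost:8091/influxdb/v2/acl/grants/<grant_id>"

# Re-create the grant with the updated parameters.
curl -s --location-trusted -u "admin-username:admin-password" \
  -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
  -H "Content-Type: application/json" \
  --data-binary '{
    "database": {"match": "exact", "value": "my_database"},
    "permissions": ["read"],
    "users": [{"name": "user1"}]
  }'
```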
|
||||
|
||||
### Remove a grant
|
||||
To delete a grant, obtain the grant ID using the `GET` request method with the
|
||||
`acl/grants` endpoint.
|
||||
Use the `DELETE` request method to delete a grant by ID.
|
||||
|
||||
```sh
|
||||
# Obtain the grant ID from the list of grants
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
|
||||
|
||||
# Delete the grant using the grant ID
|
||||
curl --location-trusted -u "admin-username:admin-password" \
|
||||
-XDELETE "http://localhost:8091/influxdb/v2/acl/grants/<grant_id>"
|
||||
```
|
|
@ -0,0 +1,84 @@
|
|||
---
|
||||
title: Introduction to authorization in InfluxDB Enterprise
|
||||
description: >
|
||||
Learn the basics of managing users and permissions in InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Introduction to authorization
|
||||
parent: Manage users and permissions
|
||||
weight: 30
|
||||
related:
|
||||
- /enterprise_influxdb/v1.10/guides/fine-grained-authorization/
|
||||
- /{{< latest "chronograf" >}}/administration/managing-influxdb-users/
|
||||
---
|
||||
|
||||
Authorization in InfluxDB Enterprise refers to managing user permissions.
|
||||
To secure and manage access to an InfluxDB Enterprise cluster,
|
||||
first [configure authentication](/enterprise_influxdb/v1.10/administration/configure/security/authentication/).
|
||||
You can then manage users and permissions as necessary.
|
||||
|
||||
This page is meant to help new users choose the best method
|
||||
for managing permissions in InfluxDB Enterprise.
|
||||
|
||||
## Permissions in InfluxDB Enterprise
|
||||
|
||||
InfluxDB Enterprise has an [expanded set of 16 permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
|
||||
These permissions control read and write access to data, both for all databases and for individual databases,
and allow certain cluster-management actions such as creating or deleting resources.
|
||||
|
||||
InfluxDB 1.x OSS only supports database-level privileges: `READ` and `WRITE`.
|
||||
A third permission, `ALL`, grants admin privileges.
|
||||
These three permissions exist in InfluxDB Enterprise as well.
|
||||
They can _only be granted by using InfluxQL_.
|
||||
|
||||
## Manage user authorization
|
||||
|
||||
Choose one of the following methods to manage authorization in InfluxDB Enterprise:
|
||||
|
||||
- using [InfluxQL](#manage-read-and-write-privileges-with-influxql)
|
||||
{{% note %}}
|
||||
InfluxQL can only grant `READ`, `WRITE`, and `ALL PRIVILEGES` privileges.
To use the full set of InfluxDB Enterprise [permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/),
use [Chronograf](#manage-enterprise-permissions-with-chronograf)
or the [Meta API (recommended)](#manage-enterprise-permissions-with-the-meta-api).
|
||||
{{% /note %}}
|
||||
- using [Chronograf](#manage-enterprise-permissions-with-chronograf)
|
||||
- using the [InfluxDB Enterprise meta API](#manage-enterprise-permissions-with-the-meta-api) (**Recommended**)
|
||||
|
||||
### Manage read and write privileges with InfluxQL
|
||||
|
||||
If you only need to manage basic `READ`, `WRITE`, and `ALL` privileges,
|
||||
use InfluxQL to manage authorizations.
|
||||
(For instance, if you upgraded from InfluxDB OSS 1.x
|
||||
and do not need the more detailed authorization in InfluxDB Enterprise, continue to use InfluxQL.)
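
For example, a minimal sketch using the `influx` CLI (the admin credentials, user name, and database name are placeholders):

```sh
# Create a user and grant basic database-level privileges with InfluxQL.
influx -username admin -password 'admin-password' \
  -execute 'CREATE USER "todd" WITH PASSWORD '\''todd-password'\'''

influx -username admin -password 'admin-password' \
  -execute 'GRANT READ ON "my_database" TO "todd"'
```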
|
||||
|
||||
{{% warn %}}
|
||||
We recommend operators *do not* mix and match InfluxQL
|
||||
with other authorization management methods (Chronograf and the API).
|
||||
Doing so may lead to inconsistencies in user permissions.
|
||||
{{% /warn %}}
|
||||
|
||||
### Manage Enterprise permissions with Chronograf
|
||||
|
||||
The Chronograf user interface can manage the
|
||||
[full set of InfluxDB Enterprise permissions](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/permissions/#permissions).
|
||||
|
||||
The permissions listed in Chronograf are global for the cluster, and available through the API.
|
||||
Outside of [FGA](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/fine-grained-authorization),
|
||||
the only database-level permissions available are the basic `READ` and `WRITE`.
|
||||
These can only be managed using [InfluxQL](#manage-read-and-write-privileges-with-influxql).
|
||||
|
||||
Chronograf can only set permissions globally, for all databases, within a cluster.
|
||||
If you need to set permissions at the database level, use the [Meta API](#manage-enterprise-permissions-with-the-meta-api).
|
||||
|
||||
See ["Manage InfluxDB users in Chronograf"](/chronograf/v1.10/administration/managing-influxdb-users/)
|
||||
for instructions.
|
||||
|
||||
### Manage Enterprise permissions with the Meta API
|
||||
|
||||
The InfluxDB Enterprise meta API is the recommended method for managing permissions.
Use the API to set both cluster-wide and database-specific permissions.
|
||||
|
||||
For more information on using the meta API,
see [Manage authorization with the InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api).
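
As a preview, the following sketch adds database-specific permissions to a user through a meta node (names and credentials are placeholders; see the linked page for the full set of actions and payload fields):

```sh
# Add ReadData and WriteData permissions on my_database to the user "todd".
curl -s --location-trusted -u "admin-username:admin-password" \
  -XPOST "http://localhost:8091/user" \
  --data-urlencode 'user={"action":"add-permissions","user":{"name":"todd","permissions":{"my_database":["ReadData","WriteData"]}}}'
```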
|
|
@ -0,0 +1,154 @@
|
|||
---
|
||||
title: Enterprise users and permissions reference
|
||||
description: >
|
||||
Detailed reference for users, roles, permissions, and permission-to-statement mappings.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Manage users and permissions
|
||||
weight: 100
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/features/users/
|
||||
---
|
||||
|
||||
{{% enterprise-warning-authn-b4-authz %}}
|
||||
|
||||
- [Users](#users)
|
||||
- [Permissions](#permissions)
|
||||
|
||||
## Users
|
||||
|
||||
Users have permissions and roles.
|
||||
|
||||
### Roles
|
||||
|
||||
Roles are groups of permissions.
|
||||
A single role can belong to several users.
|
||||
|
||||
InfluxDB Enterprise clusters have two built-in roles:
|
||||
|
||||
#### Global Admin
|
||||
|
||||
The Global Admin role has all 16 [cluster permissions](#permissions).
|
||||
|
||||
#### Admin
|
||||
|
||||
The Admin role has all [cluster permissions](#permissions) except for the
|
||||
permissions to:
|
||||
|
||||
* Add/Remove Nodes
|
||||
* Copy Shard
|
||||
* Manage Shards
|
||||
* Rebalance
|
||||
|
||||
## Permissions
|
||||
|
||||
A **permission** (also *privilege*) is the ability to access a resource in some way, including:
|
||||
- viewing the resource
|
||||
- copying the resource
|
||||
- dropping the resource
|
||||
- writing to the resource
|
||||
- full management capabilities
|
||||
|
||||
InfluxDB Enterprise clusters have 16 permissions:
|
||||
|
||||
| Permission | Description | Token |
|
||||
|:--------------------------|---------------------------------------------------------|------------------------|
|
||||
| View Admin | Permission to view or edit admin screens | `ViewAdmin` |
|
||||
| View Chronograf | Permission to use Chronograf tools | `ViewChronograf` |
|
||||
| Create Databases | Permission to create databases | `CreateDatabase` |
|
||||
| Create Users & Roles | Permission to create users and roles | `CreateUserAndRole` |
|
||||
| Add/Remove Nodes | Permission to add/remove nodes from a cluster | `AddRemoveNode` |
|
||||
| Drop Databases | Permission to drop databases | `DropDatabase` |
|
||||
| Drop Data | Permission to drop measurements and series | `DropData` |
|
||||
| Read | Permission to read data | `ReadData` |
|
||||
| Write | Permission to write data | `WriteData` |
|
||||
| Rebalance | Permission to rebalance a cluster | `Rebalance` |
|
||||
| Manage Shards | Permission to copy and delete shards | `ManageShard` |
|
||||
| Manage Continuous Queries | Permission to create, show, and drop continuous queries | `ManageContinuousQuery` |
|
||||
| Manage Queries | Permission to show and kill queries | `ManageQuery` |
|
||||
| Manage Subscriptions | Permission to show, add, and drop subscriptions | `ManageSubscription` |
|
||||
| Monitor | Permission to show stats and diagnostics | `Monitor` |
|
||||
| Copy Shard | Permission to copy shards | `CopyShard` |
|
||||
|
||||
In addition, two tokens govern Kapacitor permissions:
|
||||
|
||||
* `KapacitorAPI`:
|
||||
Grants the user permission to create, read, update and delete
|
||||
tasks, topics, handlers and similar Kapacitor artifacts.
|
||||
* `KapacitorConfigAPI`:
|
||||
Grants the user permission to override the Kapacitor configuration
|
||||
dynamically using the configuration endpoint.
|
||||
|
||||
### Permissions scope
|
||||
|
||||
Using the InfluxDB Enterprise Meta API,
|
||||
these permissions can be set at the cluster-wide level (for all databases at once)
|
||||
and for specific databases.
|
||||
For examples, see [Manage authorization with the InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/users-and-permissions/authorization-api/).
|
||||
|
||||
### Permission to Statement
|
||||
|
||||
The following table describes permissions required to execute the associated database statement.
|
||||
<!-- It also describes whether these permissions apply just to InfluxDB (Database) or InfluxDB Enterprise (Cluster). -->
|
||||
|
||||
| Permission | Statement |
|
||||
|----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| CreateDatabasePermission | AlterRetentionPolicyStatement, CreateDatabaseStatement, CreateRetentionPolicyStatement, ShowRetentionPoliciesStatement |
|
||||
| ManageContinuousQueryPermission | CreateContinuousQueryStatement, DropContinuousQueryStatement, ShowContinuousQueriesStatement |
|
||||
| ManageSubscriptionPermission | CreateSubscriptionStatement, DropSubscriptionStatement, ShowSubscriptionsStatement |
|
||||
| CreateUserAndRolePermission | CreateUserStatement, DropUserStatement, GrantAdminStatement, GrantStatement, RevokeAdminStatement, RevokeStatement, SetPasswordUserStatement, ShowGrantsForUserStatement, ShowUsersStatement |
|
||||
| DropDataPermission | DeleteSeriesStatement, DeleteStatement, DropMeasurementStatement, DropSeriesStatement |
|
||||
| DropDatabasePermission | DropDatabaseStatement, DropRetentionPolicyStatement |
|
||||
| ManageShardPermission | DropShardStatement,ShowShardGroupsStatement, ShowShardsStatement |
|
||||
| ManageQueryPermission | KillQueryStatement, ShowQueriesStatement |
|
||||
| MonitorPermission | ShowDiagnosticsStatement, ShowStatsStatement |
|
||||
| ReadDataPermission | ShowFieldKeysStatement, ShowMeasurementsStatement, ShowSeriesStatement, ShowTagKeysStatement, ShowTagValuesStatement, ShowRetentionPoliciesStatement |
|
||||
| NoPermissions | ShowDatabasesStatement |
|
||||
| Determined by type of select statement | SelectStatement |
|
||||
|
||||
### Statement to Permission
|
||||
|
||||
The following table describes database statements and the permissions required to execute them.
|
||||
It also describes whether these permissions apply at the database or cluster level.
|
||||
|
||||
| Statement | Permissions | Scope | |
|
||||
|--------------------------------|----------------------------------------|----------|--------------------------------------------------------------------------|
|
||||
| AlterRetentionPolicyStatement | CreateDatabasePermission | Database | |
|
||||
| CreateContinuousQueryStatement | ManageContinuousQueryPermission | Database | |
|
||||
| CreateDatabaseStatement | CreateDatabasePermission | Cluster | |
|
||||
| CreateRetentionPolicyStatement | CreateDatabasePermission | Database | |
|
||||
| CreateSubscriptionStatement | ManageSubscriptionPermission | Database | |
|
||||
| CreateUserStatement | CreateUserAndRolePermission | Database | |
|
||||
| DeleteSeriesStatement | DropDataPermission | Database | |
|
||||
| DeleteStatement | DropDataPermission | Database | |
|
||||
| DropContinuousQueryStatement | ManageContinuousQueryPermission | Database | |
|
||||
| DropDatabaseStatement | DropDatabasePermission | Cluster | |
|
||||
| DropMeasurementStatement | DropDataPermission | Database | |
|
||||
| DropRetentionPolicyStatement | DropDatabasePermission | Database | |
|
||||
| DropSeriesStatement | DropDataPermission | Database | |
|
||||
| DropShardStatement | ManageShardPermission | Cluster | |
|
||||
| DropSubscriptionStatement | ManageSubscriptionPermission | Database | |
|
||||
| DropUserStatement | CreateUserAndRolePermission | Database | |
|
||||
| GrantAdminStatement | CreateUserAndRolePermission | Database | |
|
||||
| GrantStatement | CreateUserAndRolePermission | Database | |
|
||||
| KillQueryStatement | ManageQueryPermission | Database | |
|
||||
| RevokeAdminStatement | CreateUserAndRolePermission | Database | |
|
||||
| RevokeStatement | CreateUserAndRolePermission | Database | |
|
||||
| SelectStatement | Determined by type of select statement | n/a | |
|
||||
| SetPasswordUserStatement | CreateUserAndRolePermission | Database | |
|
||||
| ShowContinuousQueriesStatement | ManageContinuousQueryPermission | Database | |
|
||||
| ShowDatabasesStatement | NoPermissions | Cluster | The user's grants determine which databases are returned in the results. |
|
||||
| ShowDiagnosticsStatement | MonitorPermission | Database | |
|
||||
| ShowFieldKeysStatement | ReadDataPermission | Database | |
|
||||
| ShowGrantsForUserStatement | CreateUserAndRolePermission | Database | |
|
||||
| ShowMeasurementsStatement | ReadDataPermission | Database | |
|
||||
| ShowQueriesStatement | ManageQueryPermission | Database | |
|
||||
| ShowRetentionPoliciesStatement | CreateDatabasePermission | Database | |
|
||||
| ShowSeriesStatement | ReadDataPermission | Database | |
|
||||
| ShowShardGroupsStatement | ManageShardPermission | Cluster | |
|
||||
| ShowShardsStatement | ManageShardPermission | Cluster | |
|
||||
| ShowStatsStatement | MonitorPermission | Database | |
|
||||
| ShowSubscriptionsStatement | ManageSubscriptionPermission | Database | |
|
||||
| ShowTagKeysStatement | ReadDataPermission | Database | |
|
||||
| ShowTagValuesStatement | ReadDataPermission | Database | |
|
||||
| ShowUsersStatement | CreateUserAndRolePermission | Database | |
|
|
@ -0,0 +1,30 @@
|
|||
---
|
||||
title: Monitor InfluxDB Enterprise
|
||||
description: Monitor InfluxDB Enterprise with InfluxDB Cloud or OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Monitor
|
||||
parent: Administration
|
||||
weight: 50
|
||||
---
|
||||
|
||||
Monitoring is the act of observing changes in data over time.
|
||||
There are multiple ways to monitor your InfluxDB Enterprise cluster.
|
||||
See the guides below to monitor a cluster using another InfluxDB instance.
|
||||
|
||||
Alternatively, to view your output data occasionally (_e.g._, for auditing or diagnostics),
|
||||
do one of the following:
|
||||
|
||||
- [Log and trace InfluxDB Enterprise operations](/enterprise_influxdb/v1.10/administration/monitor/logs/)
|
||||
- [Use InfluxQL for diagnostics](/enterprise_influxdb/v1.10/administration/monitor/diagnostics/)
|
||||
|
||||
{{% note %}}
|
||||
### Monitor with InfluxDB Aware and Influx Insights
|
||||
InfluxDB Aware and Influx Insights are free Enterprise services that send your data to a free Cloud account.
With Aware, you monitor your data yourself.
With Insights, the support team helps you monitor your data.
|
||||
|
||||
To apply for this service, please contact the [support team](https://support.influxdata.com/s/login/).
|
||||
{{% /note %}}
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,27 @@
|
|||
---
|
||||
title: Use InfluxQL for diagnostics
|
||||
description: Use InfluxQL commands for diagnostics and statistics.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Diagnostics
|
||||
parent: Monitor
|
||||
weight: 104
|
||||
---
|
||||
|
||||
The commands below are useful when diagnosing issues with InfluxDB Enterprise clusters.
|
||||
Use the [`influx` CLI](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) to run these commands.
|
||||
|
||||
### SHOW STATS
|
||||
|
||||
To see node statistics, run `SHOW STATS`.
|
||||
The statistics returned by `SHOW STATS` are stored in memory only,
|
||||
and are reset to zero when the node is restarted.
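
For example (a sketch that assumes authentication is enabled and the command is run against a data node):

```sh
influx -username admin -password 'admin-password' -execute 'SHOW STATS'
```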
|
||||
|
||||
For details on this command, see [`SHOW STATS`](/enterprise_influxdb/v1.10/query_language/spec#show-stats).
|
||||
|
||||
### SHOW DIAGNOSTICS
|
||||
|
||||
To see node diagnostic information, run `SHOW DIAGNOSTICS`.
|
||||
This returns information such as build information, uptime, hostname, server configuration, memory usage, and Go runtime diagnostics.
|
||||
|
||||
For details on this command, see [`SHOW DIAGNOSTICS`](/enterprise_influxdb/v1.10/query_language/spec#show-diagnostics).
|
|
@ -0,0 +1,308 @@
|
|||
---
|
||||
title: Log and trace InfluxDB Enterprise operations
|
||||
description: >
|
||||
Learn about logging locations, redirecting HTTP request logging, structured logging, and tracing.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Log and trace
|
||||
parent: Monitor
|
||||
weight: 103
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/logs/
|
||||
---
|
||||
|
||||
|
||||
* [Logging locations](#logging-locations)
|
||||
* [Redirect HTTP request logging](#redirect-http-access-logging)
|
||||
* [Structured logging](#structured-logging)
|
||||
* [Tracing](#tracing)
|
||||
|
||||
|
||||
InfluxDB writes log output, by default, to `stderr`.
|
||||
Depending on your use case, this log information can be written to another location.
|
||||
Some service managers may override this default.
|
||||
|
||||
## Logging locations
|
||||
|
||||
### Run InfluxDB directly
|
||||
|
||||
If you run InfluxDB Enterprise processes directly, all logs are written to `stderr`.
You can redirect this log output as you would any output to `stderr`, like so:
|
||||
|
||||
```bash
|
||||
influxdb-meta 2>$HOME/my_log_file # Meta nodes
|
||||
influxd 2>$HOME/my_log_file # Data nodes
|
||||
influx-enterprise 2>$HOME/my_log_file # Enterprise Web
|
||||
```
|
||||
|
||||
### Launched as a service
|
||||
|
||||
#### sysvinit
|
||||
|
||||
If InfluxDB was installed using a pre-built package, and then launched
|
||||
as a service, `stderr` is redirected to
|
||||
`/var/log/influxdb/<node-type>.log`, and all log data will be written to
|
||||
that file. You can override this location by setting the variable
|
||||
`STDERR` in the file `/etc/default/<node-type>`.
|
||||
|
||||
For example, if on a data node `/etc/default/influxdb` contains:
|
||||
|
||||
```bash
|
||||
STDERR=/dev/null
|
||||
```
|
||||
|
||||
all log data will be discarded. You can similarly direct output to
|
||||
`stdout` by setting `STDOUT` in the same file. Output to `stdout` is
|
||||
sent to `/dev/null` by default when InfluxDB is launched as a service.
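
For example, to keep both streams in flat files on a data node (the paths below are only an illustration):

```bash
# /etc/default/influxdb
STDERR=/var/log/influxdb/influxd.log
STDOUT=/var/log/influxdb/influxd.out
```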
|
||||
|
||||
InfluxDB must be restarted to pick up any changes to `/etc/default/<node-type>`.
|
||||
|
||||
|
||||
##### Meta nodes
|
||||
|
||||
For meta nodes, the `<node-type>` is `influxdb-meta`.
The default log file is `/var/log/influxdb/influxdb-meta.log`.
The service configuration file is `/etc/default/influxdb-meta`.
|
||||
|
||||
##### Data nodes
|
||||
|
||||
For data nodes, the `<node-type>` is `influxdb`.
The default log file is `/var/log/influxdb/influxdb.log`.
The service configuration file is `/etc/default/influxdb`.
|
||||
|
||||
##### Enterprise Web
|
||||
|
||||
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
The default log file is `/var/log/influxdb/influx-enterprise.log`.
The service configuration file is `/etc/default/influx-enterprise`.
|
||||
|
||||
#### systemd
|
||||
|
||||
Starting with version 1.0, InfluxDB on systemd systems no longer
|
||||
writes files to `/var/log/<node-type>.log` by default, and now uses the
|
||||
system configured default for logging (usually `journald`). On most
|
||||
systems, the logs will be directed to the systemd journal and can be
|
||||
accessed with the command:
|
||||
|
||||
```
|
||||
sudo journalctl -u <node-type>.service
|
||||
```
|
||||
|
||||
Please consult the systemd journald documentation for configuring
|
||||
journald.
|
||||
|
||||
##### Meta nodes
|
||||
|
||||
For meta nodes, the `<node-type>` is `influxdb-meta`.
The default log command is `sudo journalctl -u influxdb-meta.service`.
The service configuration file is `/etc/default/influxdb-meta`.
|
||||
|
||||
##### Data nodes
|
||||
|
||||
For data nodes, the `<node-type>` is `influxdb`.
The default log command is `sudo journalctl -u influxdb.service`.
The service configuration file is `/etc/default/influxdb`.
|
||||
|
||||
##### Enterprise Web
|
||||
|
||||
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
The default log command is `sudo journalctl -u influx-enterprise.service`.
The service configuration file is `/etc/default/influx-enterprise`.
|
||||
|
||||
### Use logrotate
|
||||
|
||||
You can use [logrotate](https://manpages.ubuntu.com/manpages/jammy/en/man8/logrotate.8.html)
|
||||
to rotate the log files generated by InfluxDB on systems where logs are written to flat files.
|
||||
If using the package install on a sysvinit system, the config file for logrotate is installed in `/etc/logrotate.d`.
|
||||
You can view the file [here](https://github.com/influxdb/influxdb/blob/master/scripts/logrotate).
|
||||
|
||||
## Redirect HTTP access logging
|
||||
|
||||
InfluxDB 1.5 introduces the option to log HTTP request traffic separately from the other InfluxDB log output. When HTTP request logging is enabled, the HTTP logs are intermingled by default with internal InfluxDB logging. By redirecting the HTTP request log entries to a separate file, both log files are easier to read, monitor, and debug.
|
||||
|
||||
See [Redirecting HTTP request logging](/enterprise_influxdb/v1.10/administration/logs/#redirecting-http-access-logging) in the InfluxDB OSS documentation.
|
||||
|
||||
## Structured logging
|
||||
|
||||
With InfluxDB 1.5, structured logging is supported, enabling machine-readable and more developer-friendly log output formats. The two structured log formats, `logfmt` and `json`, provide easier filtering and searching with external tools and simplify integration of InfluxDB logs with Splunk, Papertrail, Elasticsearch, and other third-party tools.
|
||||
|
||||
See [Structured logging](/enterprise_influxdb/v1.10/administration/logs/#structured-logging) in the InfluxDB OSS documentation.
|
||||
|
||||
## Tracing
|
||||
|
||||
Logging has been enhanced to provide tracing of important InfluxDB operations.
|
||||
Tracing is useful for error reporting and discovering performance bottlenecks.
|
||||
|
||||
### Logging keys used in tracing
|
||||
|
||||
#### Tracing identifier key
|
||||
|
||||
The `trace_id` key specifies a unique identifier for a specific instance of a trace.
|
||||
You can use this key to filter and correlate all related log entries for an operation.
|
||||
|
||||
All operation traces include consistent starting and ending log entries, with the same message (`msg`) describing the operation (e.g., "TSM compaction"), but adding the appropriate `op_event` context (either `start` or `end`).
|
||||
For an example, see [Finding all trace log entries for an InfluxDB operation](#finding-all-trace-log-entries-for-an-influxdb-operation).
|
||||
|
||||
**Example:** `trace_id=06R0P94G000`
|
||||
|
||||
#### Operation keys
|
||||
|
||||
The following operation keys identify an operation's name, the start and end timestamps, and the elapsed execution time.
|
||||
|
||||
##### `op_name`
|
||||
Unique identifier for an operation.
|
||||
You can filter on all operations of a specific name.
|
||||
|
||||
**Example:** `op_name=tsm1_compact_group`
|
||||
|
||||
##### `op_event`
|
||||
Specifies the start and end of an event.
|
||||
The two possible values, `(start)` or `(end)`, are used to indicate when an operation started or ended.
|
||||
For example, you can grep by values in `op_name` AND `op_event` to find all starting operation log entries.
|
||||
For an example of this, see [Finding all starting log entries](#finding-all-starting-operation-log-entries).
|
||||
|
||||
**Example:** `op_event=start`
|
||||
|
||||
##### `op_elapsed`
|
||||
Duration of the operation execution.
|
||||
Logged with the ending trace log entry.
|
||||
Valid duration units are `ns`, `µs`, `ms`, and `s`.
|
||||
|
||||
**Example:** `op_elapsed=352ms`
|
||||
|
||||
|
||||
#### Log identifier context key
|
||||
|
||||
The log identifier key (`log_id`) lets you easily identify _every_ log entry for a single execution of an `influxd` process.
|
||||
There are other ways to isolate a single execution in a log file, but the consistent `log_id` makes searching in log aggregation services easier.
|
||||
|
||||
**Example:** `log_id=06QknqtW000`
|
||||
|
||||
#### Database context keys
|
||||
|
||||
- **db\_instance**: Database name
|
||||
- **db\_rp**: Retention policy name
|
||||
- **db\_shard\_id**: Shard identifier
|
||||
- **db\_shard\_group**: Shard group identifier
|
||||
|
||||
### Tooling
|
||||
|
||||
Here are a couple of popular tools available for processing and filtering log files output in `logfmt` or `json` formats.
|
||||
|
||||
#### hutils
|
||||
|
||||
The [hutils](https://blog.heroku.com/hutils-explore-your-structured-data-logs) utility collection, provided by Heroku, provides tools for working with `logfmt`-encoded logs, including:
|
||||
|
||||
- **lcut**: Extracts values from a `logfmt` trace based on a specified field name.
|
||||
- **lfmt**: Prettifies `logfmt` lines as they emerge from a stream, and highlights their key sections.
|
||||
- **ltap**: Accesses messages from log providers in a consistent way to allow easy parsing by other utilities that operate on `logfmt` traces.
|
||||
- **lviz**: Visualizes `logfmt` output by building a tree out of a dataset combining common sets of key-value pairs into shared parent nodes.
|
||||
|
||||
#### lnav (Log File Navigator)
|
||||
|
||||
The [lnav (Log File Navigator)](http://lnav.org) is an advanced log file viewer useful for watching and analyzing your log files from a terminal.
|
||||
The lnav viewer provides a single log view, automatic log format detection, filtering, timeline view, pretty-print view, and querying logs using SQL.
|
||||
|
||||
### Operations
|
||||
|
||||
The following operations, listed by their operation name (`op_name`) are traced in InfluxDB internal logs and available for use without changes in logging level.
|
||||
|
||||
#### Initial opening of data files
|
||||
|
||||
The `tsdb_open` operation traces include all events related to the initial opening of the `tsdb_store`.
|
||||
|
||||
|
||||
#### Retention policy shard deletions
|
||||
|
||||
The `retention.delete_check` operation includes all shard deletions related to the retention policy.
|
||||
|
||||
#### TSM snapshotting in-memory cache to disk
|
||||
|
||||
The `tsm1_cache_snapshot` operation represents the snapshotting of the TSM in-memory cache to disk.
|
||||
|
||||
#### TSM compaction strategies
|
||||
|
||||
The `tsm1_compact_group` operation includes all trace log entries related to TSM compaction strategies and displays the related TSM compaction strategy keys:
|
||||
|
||||
- **tsm1\_strategy**: level or full
|
||||
- **tsm1\_level**: 1, 2, or 3
|
||||
- **tsm\_optimize**: true or false
|
||||
|
||||
#### Series file compactions
|
||||
|
||||
The `series_partition_compaction` operation includes all trace log entries related to series file compactions.
|
||||
|
||||
#### Continuous query execution (if logging enabled)
|
||||
|
||||
The `continuous_querier_execute` operation includes all continuous query executions, if logging is enabled.
|
||||
|
||||
#### TSI log file compaction
|
||||
|
||||
The `tsi1_compact_log_file` operation includes all trace log entries related to log file compactions.
|
||||
|
||||
#### TSI level compaction
|
||||
|
||||
The `tsi1_compact_to_level` operation includes all trace log entries for TSI level compactions.
|
||||
|
||||
|
||||
### Tracing examples
|
||||
|
||||
#### Finding all trace log entries for an InfluxDB operation
|
||||
|
||||
In the example below, you can see the log entries for all trace operations related to a "TSM compaction" process.
|
||||
Note that the initial entry shows the message "TSM compaction (start)" and the final entry displays the message "TSM compaction (end)".
|
||||
{{% note %}}
|
||||
Log entries were grepped using the `trace_id` value and then the specified key values were displayed using `lcut` (an `hutils` tool).
|
||||
{{% /note %}}
|
||||
|
||||
```
|
||||
$ grep "06QW92x0000" influxd.log | lcut ts lvl msg strategy level
|
||||
2018-02-21T20:18:56.880065Z info TSM compaction (start) full
|
||||
2018-02-21T20:18:56.880162Z info Beginning compaction full
|
||||
2018-02-21T20:18:56.880185Z info Compacting file full
|
||||
2018-02-21T20:18:56.880211Z info Compacting file full
|
||||
2018-02-21T20:18:56.880226Z info Compacting file full
|
||||
2018-02-21T20:18:56.880254Z info Compacting file full
|
||||
2018-02-21T20:19:03.928640Z info Compacted file full
|
||||
2018-02-21T20:19:03.928687Z info Finished compacting files full
|
||||
2018-02-21T20:19:03.928707Z info TSM compaction (end) full
|
||||
```
|
||||
|
||||
|
||||
#### Finding all starting operation log entries
|
||||
|
||||
To find all starting operation log entries, you can grep by values in `op_name` AND `op_event`.
|
||||
In the following example, the grep returned 101 entries, so the result below only displays the first entry.
|
||||
In the example result entry, the timestamp, level, strategy, trace_id, op_name, and op_event values are included.
|
||||
|
||||
```
|
||||
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start'
|
||||
ts=2018-02-21T20:16:16.709953Z lvl=info msg="TSM compaction" log_id=06QVNNCG000 engine=tsm1 level=1 strategy=level trace_id=06QV~HHG000 op_name=tsm1_compact_group op_event=start
|
||||
...
|
||||
```
|
||||
|
||||
Using the `lcut` utility (in hutils), the following command uses the previous `grep` command, but adds an `lcut` command to only display the keys and their values for keys that are not identical in all of the entries.
|
||||
The following output includes 19 unique log entries, displaying the selected keys: `ts`, `strategy`, `level`, and `trace_id`.
|
||||
|
||||
```
|
||||
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start' | lcut ts strategy level trace_id | sort -u
|
||||
2018-02-21T20:16:16.709953Z level 1 06QV~HHG000
|
||||
2018-02-21T20:16:40.707452Z level 1 06QW0k0l000
|
||||
2018-02-21T20:17:04.711519Z level 1 06QW2Cml000
|
||||
2018-02-21T20:17:05.708227Z level 2 06QW2Gg0000
|
||||
2018-02-21T20:17:29.707245Z level 1 06QW3jQl000
|
||||
2018-02-21T20:17:53.711948Z level 1 06QW5CBl000
|
||||
2018-02-21T20:18:17.711688Z level 1 06QW6ewl000
|
||||
2018-02-21T20:18:56.880065Z full 06QW92x0000
|
||||
2018-02-21T20:20:46.202368Z level 3 06QWFizW000
|
||||
2018-02-21T20:21:25.292557Z level 1 06QWI6g0000
|
||||
2018-02-21T20:21:49.294272Z level 1 06QWJ_RW000
|
||||
2018-02-21T20:22:13.292489Z level 1 06QWL2B0000
|
||||
2018-02-21T20:22:37.292431Z level 1 06QWMVw0000
|
||||
2018-02-21T20:22:38.293320Z level 2 06QWMZqG000
|
||||
2018-02-21T20:23:01.293690Z level 1 06QWNygG000
|
||||
2018-02-21T20:23:25.292956Z level 1 06QWPRR0000
|
||||
2018-02-21T20:24:33.291664Z full 06QWTa2l000
|
||||
2018-02-21T21:12:08.017055Z full 06QZBpKG000
|
||||
2018-02-21T21:12:08.478200Z full 06QZBr7W000
|
||||
```
|
|
@ -0,0 +1,185 @@
|
|||
---
|
||||
title: Monitor InfluxDB Enterprise with InfluxDB Cloud
|
||||
description: >
|
||||
Monitor your InfluxDB Enterprise instance using InfluxDB Cloud and
|
||||
a pre-built InfluxDB template.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Monitor with Cloud
|
||||
parent: Monitor
|
||||
weight: 100
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/monitor-enterprise/monitor-with-cloud/
|
||||
---
|
||||
|
||||
Use [InfluxDB Cloud](/influxdb/cloud/), the [InfluxDB Enterprise 1.x Template](https://github.com/influxdata/community-templates/tree/master/influxdb-enterprise-1x), and Telegraf to monitor one or more InfluxDB Enterprise instances.
|
||||
|
||||
Do the following:
|
||||
|
||||
1. [Review requirements](#review-requirements)
|
||||
2. [Install the InfluxDB Enterprise Monitoring template](#install-the-influxdb-enterprise-monitoring-template)
|
||||
3. [Set up InfluxDB Enterprise for monitoring](#set-up-influxdb-enterprise-for-monitoring)
|
||||
4. [Set up Telegraf](#set-up-telegraf)
|
||||
5. [View the Monitoring dashboard](#view-the-monitoring-dashboard)
|
||||
6. (Optional) [Alert when metrics stop reporting](#alert-when-metrics-stop-reporting)
|
||||
7. (Optional) [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule)
|
||||
8. (Optional) [Monitor with InfluxDB Insights and Aware](#monitor-with-influxdb-insights-and-aware)
|
||||
|
||||
## Review requirements
|
||||
|
||||
Before you begin, make sure you have access to the following:
|
||||
|
||||
- An InfluxDB Cloud account. ([Sign up for free here](https://cloud2.influxdata.com/signup)).
|
||||
- Command line access to a machine [running InfluxDB Enterprise 1.x](/enterprise_influxdb/v1.10/introduction/install-and-deploy/) and permissions to install Telegraf on this machine.
|
||||
- Internet connectivity from the machine running InfluxDB Enterprise 1.x and Telegraf to InfluxDB Cloud.
|
||||
- Sufficient resource availability to install the template. (InfluxDB Cloud Free Plan accounts include a finite number of [available resources](/influxdb/cloud/account-management/limits/#free-plan-limits).)
|
||||
|
||||
## Install the InfluxDB Enterprise Monitoring template
|
||||
|
||||
The InfluxDB Enterprise Monitoring template includes a Telegraf configuration that sends InfluxDB Enterprise metrics to an InfluxDB endpoint, and a dashboard that visualizes the metrics.
|
||||
|
||||
1. [Log into your InfluxDB Cloud account](https://cloud2.influxdata.com/), go to **Settings > Templates**, and enter the following template URL:
|
||||
|
||||
```
|
||||
https://raw.githubusercontent.com/influxdata/community-templates/master/influxdb-enterprise-1x/enterprise.yml
|
||||
```
|
||||
|
||||
2. Click **Lookup Template**, and then click **Install Template**. InfluxDB Cloud imports the template, which includes the following resources:
|
||||
- Telegraf Configuration `monitoring-enterprise-1x`
|
||||
- Dashboard `InfluxDB 1.x Enterprise`
|
||||
- Label `enterprise`
|
||||
- Variables `influxdb_host` and `bucket`
|
||||
|
||||
## Set up InfluxDB Enterprise for monitoring
|
||||
|
||||
By default, InfluxDB Enterprise 1.x has a `/metrics` endpoint available, which exports Prometheus-style system metrics.
|
||||
|
||||
1. Make sure the `/metrics` endpoint is [enabled](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled). If you've changed the default settings to disable the `/metrics` endpoint, [re-enable these settings](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled).
|
||||
2. Navigate to the `/metrics` endpoint of your InfluxDB Enterprise instance to view the InfluxDB Enterprise system metrics in your browser:
|
||||
|
||||
```
|
||||
http://localhost:8086/metrics
|
||||
```
|
||||
|
||||
Or use `curl` to fetch metrics:
|
||||
|
||||
```sh
|
||||
curl http://localhost:8086/metrics
|
||||
# HELP boltdb_reads_total Total number of boltdb reads
|
||||
# TYPE boltdb_reads_total counter
|
||||
boltdb_reads_total 41
|
||||
# HELP boltdb_writes_total Total number of boltdb writes
|
||||
# TYPE boltdb_writes_total counter
|
||||
boltdb_writes_total 28
|
||||
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
|
||||
...
|
||||
```
|
||||
3. Add your **InfluxDB Cloud** account information (URL and organization) to your Telegraf configuration by doing the following:
|
||||
1. Go to **Load Data > Telegraf** [in your InfluxDB Cloud account](https://cloud2.influxdata.com/), and click **InfluxDB Output Plugin** at the top-right corner.
|
||||
2. Copy the `urls`, `token`, `organization`, and `bucket` and close the window.
|
||||
3. Click **monitoring-enterprise-1.x**.
|
||||
4. Replace `urls`, `token`, `organization`, and `bucket` under `outputs.influxdb_v2` with your InfluxDB Cloud account information. Alternatively, store this information in your environment variables and include the environment variables in your configuration.
|
||||
|
||||
{{% note %}}
|
||||
To ensure the InfluxDB Enterprise monitoring dashboard can display the recorded metrics, set the destination bucket name to `enterprise_metrics` in your `telegraf.conf`.
|
||||
{{% /note %}}
|
||||
|
||||
    5. Add the [Prometheus input plugin](https://github.com/influxdata/telegraf/blob/release-1.19/plugins/inputs/prometheus/README.md) to your `telegraf.conf`. Specify your InfluxDB Enterprise URL(s) in the `urls` parameter. For example:
|
||||
|
||||
{{< keep-url >}}
|
||||
```toml
|
||||
[[inputs.prometheus]]
|
||||
urls = ["http://localhost:8086/metrics"]
|
||||
username = "$INFLUX_USER"
|
||||
password = "$INFLUX_PASSWORD"
|
||||
```
|
||||
|
||||
If you're using unique URLs or have authentication set up for your `/metrics` endpoint, configure those options here and save the updated configuration.
|
||||
|
||||
For more information about customizing Telegraf, see [Configure Telegraf](/{{< latest "telegraf" >}}/administration/configuration/#global-tags).
|
||||
4. Click **Save Changes**.
|
||||
|
||||
## Set up Telegraf
|
||||
|
||||
Set up Telegraf to scrape metrics from InfluxDB Enterprise to send to your InfluxDB Cloud account.
|
||||
|
||||
On each InfluxDB Enterprise instance you want to monitor, do the following:
|
||||
|
||||
1. Go to **Load Data > Telegraf** [in your InfluxDB Cloud account](https://cloud2.influxdata.com/).
|
||||
2. Click **Setup Instructions** under **monitoring-enterprise-1.x**.
|
||||
3. Complete the Telegraf Setup instructions. If you are using environment variables, set them up now.
|
||||
|
||||
{{% note %}}
|
||||
For your API token, generate a new token or use an existing All Access token. If you run Telegraf as a service, edit your init script to set the environment variable and ensure that it's available to the service.
|
||||
{{% /note %}}
|
||||
|
||||
Telegraf runs quietly in the background (no immediate output appears), and Telegraf begins pushing metrics to your InfluxDB Cloud account.
|
||||
|
||||
## View the Monitoring dashboard
|
||||
|
||||
To see your data in real time, view the Monitoring dashboard.
|
||||
|
||||
1. Select **Boards** (**Dashboards**) in your **InfluxDB Cloud** account.
|
||||
|
||||
{{< nav-icon "dashboards" >}}
|
||||
|
||||
2. Click **InfluxDB Enterprise Metrics**. Metrics appear in your dashboard.
|
||||
3. Customize your monitoring dashboard as needed. For example, send an alert in the following cases:
|
||||
- Users create a new task or bucket
|
||||
- You're testing machine limits
|
||||
- [Metrics stop reporting](#alert-when-metrics-stop-reporting)
|
||||
|
||||
## Alert when metrics stop reporting
|
||||
|
||||
The Monitoring template includes a [deadman check](/influxdb/cloud/monitor-alert/checks/create/#deadman-check) to verify metrics are reported at regular intervals.
|
||||
|
||||
To alert when data stops flowing from your InfluxDB Enterprise instances to your InfluxDB Cloud account, do the following:
|
||||
|
||||
1. [Customize the deadman check](#customize-the-deadman-check) to identify the fields you want to monitor.
|
||||
2. [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule) to receive notifications when your deadman check is triggered.
|
||||
|
||||
### Customize the deadman check
|
||||
|
||||
1. To view the deadman check, click **Alerts** in the navigation bar of your **InfluxDB Cloud** account.
|
||||
|
||||
{{< nav-icon "alerts" >}}
|
||||
|
||||
2. Choose an existing field or create a new field for your deadman alert:
|
||||
    1. Click **{{< icon "plus" "v2" >}} Create** and select **Deadman Check** in the dropdown menu.
|
||||
2. Define your query with at least one field.
|
||||
3. Click **Submit** and **Configure Check**.
|
||||
When metrics stop reporting, you'll receive an alert.
|
||||
3. Under **Schedule Every**, set the amount of time to check for data.
|
||||
4. Set the amount of time to wait before switching to a critical alert.
|
||||
5. Save the check, then click **View History** under the check's gear icon to verify it is running.
|
||||
|
||||
## Create a notification endpoint and rule
|
||||
|
||||
To receive a notification message when your deadman check is triggered, create a [notification endpoint](#create-a-notification-endpoint) and [rule](#create-a-notification-rule).
|
||||
|
||||
### Create a notification endpoint
|
||||
|
||||
InfluxDB Cloud supports different endpoints: Slack, PagerDuty, and HTTP. Slack is free for all users, while PagerDuty and HTTP are exclusive to the Usage-Based Plan.
|
||||
|
||||
#### Send a notification to Slack
|
||||
|
||||
1. Create a [Slack webhook](https://api.slack.com/messaging/webhooks).
|
||||
2. Go to **Alerts > Notification Endpoint** and click **{{< icon "plus" "v2" >}} Create**, and enter a name and description for your Slack endpoint.
|
||||
3. Enter your Slack Webhook under **Incoming Webhook URL** and click **Create Notification Endpoint**.
|
||||
|
||||
#### Send a notification to PagerDuty or HTTP
|
||||
|
||||
Send a notification to PagerDuty or HTTP endpoints (other webhooks) by [upgrading your InfluxDB Cloud account](/influxdb/cloud/account-management/billing/#upgrade-to-usage-based-plan).
|
||||
|
||||
### Create a notification rule
|
||||
|
||||
[Create a notification rule](/influxdb/cloud/monitor-alert/notification-rules/create/) to set rules for when to send a deadman alert message to your notification endpoint.
|
||||
|
||||
1. Go to **Alerts > Notification Rules** and click **{{< icon "plus" "v2" >}} Create**.
|
||||
2. Fill out the **About** and **Conditions** section then click **Create Notification Rule**.
|
||||
|
||||
## Monitor with InfluxDB Insights and Aware
|
||||
|
||||
For InfluxDB Enterprise customers, Insights and Aware are free services for monitoring your data. InfluxDB Insights sends your data to a private Cloud account, where it is monitored with the help of the support team. InfluxDB Aware is a similar service, but you monitor your data yourself.
|
||||
|
||||
To apply for this service, please contact the [InfluxData Support team](mailto:support@influxdata.com).
|
|
@ -0,0 +1,181 @@
|
|||
---
|
||||
title: Monitor InfluxDB Enterprise with InfluxDB OSS
|
||||
description: >
|
||||
Monitor your InfluxDB Enterprise instance using InfluxDB OSS and
|
||||
a pre-built InfluxDB template.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Monitor with OSS
|
||||
parent: Monitor
|
||||
weight: 101
|
||||
related:
|
||||
- /platform/monitoring/influxdata-platform/tools/measurements-internal
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/administration/monitor-enterprise/monitor-with-oss/
|
||||
---
|
||||
|
||||
Use [InfluxDB OSS](/influxdb/v2.0/), the [InfluxDB Enterprise 1.x Template](https://github.com/influxdata/community-templates/tree/master/influxdb-enterprise-1x), and Telegraf to monitor one or more InfluxDB Enterprise instances.
|
||||
|
||||
Do the following:
|
||||
|
||||
1. [Review requirements](#review-requirements)
|
||||
2. [Install the InfluxDB Enterprise Monitoring template](#install-the-influxdb-enterprise-monitoring-template)
|
||||
3. [Set up InfluxDB Enterprise for monitoring](#set-up-influxdb-enterprise-for-monitoring)
|
||||
4. [Set up Telegraf](#set-up-telegraf)
|
||||
5. [View the Monitoring dashboard](#view-the-monitoring-dashboard)
|
||||
6. (Optional) [Alert when metrics stop reporting](#alert-when-metrics-stop-reporting)
|
||||
7. (Optional) [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule)
|
||||
8. (Optional) [Monitor with InfluxDB Insights and Aware](#monitor-with-influxdb-insights-and-aware)
|
||||
|
||||
## Review requirements
|
||||
|
||||
Before you begin, make sure you have access to the following:
|
||||
|
||||
- A self-hosted OSS 2.x instance. ([get started for free here](/influxdb/v2.0/get-started/))
|
||||
- Command line access to a machine [running InfluxDB Enterprise 1.x](/enterprise_influxdb/v1.10/introduction/install-and-deploy/) and permissions to install Telegraf on this machine.
|
||||
- Internet connectivity from the machine running InfluxDB Enterprise 1.x and Telegraf to InfluxDB OSS.
|
||||
- Sufficient resource availability to install the template.
|
||||
|
||||
## Install the InfluxDB Enterprise Monitoring template
|
||||
|
||||
The InfluxDB Enterprise Monitoring template includes a Telegraf configuration that sends InfluxDB Enterprise metrics to an InfluxDB endpoint and a dashboard that visualizes the metrics.
|
||||
|
||||
1. [Log into your InfluxDB OSS UI](http://localhost:8086/signin), go to **Settings > Templates**, and enter the following template URL:
|
||||
|
||||
```
|
||||
https://raw.githubusercontent.com/influxdata/community-templates/master/influxdb-enterprise-1x/enterprise.yml
|
||||
```
|
||||
|
||||
2. Click **Lookup Template**, and then click **Install Template**. InfluxDB OSS imports the template, which includes the following resources:
|
||||
- Telegraf Configuration `monitoring-enterprise-1x`
|
||||
- Dashboard `InfluxDB 1.x Enterprise`
|
||||
- Label `enterprise`
|
||||
- Variables `influxdb_host` and `bucket`
|
||||
|
||||
## Set up InfluxDB Enterprise for monitoring
|
||||
|
||||
By default, InfluxDB Enterprise 1.x has a `/metrics` endpoint available, which exports Prometheus-style system metrics.
|
||||
|
||||
1. Make sure the `/metrics` endpoint is [enabled](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled). If you've changed the default settings to disable the `/metrics` endpoint, [re-enable these settings](/{{< latest "influxdb" >}}/reference/config-options/#metrics-disabled).
|
||||
2. Navigate to the `/metrics` endpoint of your InfluxDB Enterprise instance to view the InfluxDB Enterprise system metrics in your browser:
|
||||
|
||||
```
|
||||
http://localhost:8086/metrics
|
||||
```
|
||||
|
||||
Or use `curl` to fetch metrics:
|
||||
|
||||
```sh
|
||||
curl http://localhost:8086/metrics
|
||||
# HELP boltdb_reads_total Total number of boltdb reads
|
||||
# TYPE boltdb_reads_total counter
|
||||
boltdb_reads_total 41
|
||||
# HELP boltdb_writes_total Total number of boltdb writes
|
||||
# TYPE boltdb_writes_total counter
|
||||
boltdb_writes_total 28
|
||||
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
|
||||
...
|
||||
```
|
||||
3. Add your **InfluxDB OSS** account information (URL and organization) to your Telegraf configuration by doing the following:
|
||||
1. Go to **Load Data > Telegraf** [in your InfluxDB OSS account](http://localhost:8086/), and click **InfluxDB Output Plugin** at the top-right corner.
|
||||
2. Copy the `urls`, `token`, `organization`, and `bucket` and close the window.
|
||||
3. Click **monitoring-enterprise-1.x**.
|
||||
4. Replace `urls`, `token`, `organization`, and `bucket` under `outputs.influxdb_v2` with your InfluxDB OSS account information. Alternatively, store this information in your environment variables and include the environment variables in your configuration.
|
||||
|
||||
{{% note %}}
|
||||
To ensure the InfluxDB Enterprise monitoring dashboard can display the recorded metrics, set the destination bucket name to `enterprise_metrics` in your `telegraf.conf`.
|
||||
{{% /note %}}
|
||||
|
||||
    5. Add the [Prometheus input plugin](https://github.com/influxdata/telegraf/blob/release-1.19/plugins/inputs/prometheus/README.md) to your `telegraf.conf`. Specify your InfluxDB Enterprise URL(s) in the `urls` parameter. For example:
|
||||
|
||||
{{< keep-url >}}
|
||||
```toml
|
||||
[[inputs.prometheus]]
|
||||
urls = ["http://localhost:8086/metrics"]
|
||||
username = "$INFLUX_USER"
|
||||
password = "$INFLUX_PASSWORD"
|
||||
```
|
||||
|
||||
If you're using unique URLs or have security set up for your `/metrics` endpoint, configure those options here and save the updated configuration.
|
||||
|
||||
For more information about customizing Telegraf, see [Configure Telegraf](/{{< latest "telegraf" >}}/administration/configuration/#global-tags).
|
||||
4. Click **Save Changes**.
|
||||
|
||||
## Set up Telegraf
|
||||
|
||||
Set up Telegraf to scrape metrics from InfluxDB Enterprise to send to your InfluxDB OSS account.
|
||||
|
||||
On each InfluxDB Enterprise instance you want to monitor, do the following:
|
||||
|
||||
1. Go to **Load Data > Telegraf** [in your InfluxDB OSS account](http://localhost:8086/signin).
|
||||
2. Click **Setup Instructions** under **monitoring-enterprise-1.x**.
|
||||
3. Complete the Telegraf Setup instructions. If you are using environment variables, set them up now.
|
||||
|
||||
{{% note %}}
|
||||
    For your API token, generate a new token or use an existing All Access token. If you run Telegraf as a service, edit your init script to set the environment variable and ensure it's available to the service.
|
||||
{{% /note %}}
|
||||
|
||||
Telegraf runs quietly in the background (no immediate output appears), and Telegraf begins pushing metrics to your InfluxDB OSS account.
|
||||
|
||||
## View the Monitoring dashboard
|
||||
|
||||
To see your data in real time, view the Monitoring dashboard.
|
||||
|
||||
1. Select **Boards** (**Dashboards**) in your **InfluxDB OSS** account.
|
||||
|
||||
{{< nav-icon "dashboards" >}}
|
||||
|
||||
2. Click **InfluxDB Enterprise Metrics**. Metrics appear in your dashboard.
|
||||
3. Customize your monitoring dashboard as needed. For example, send an alert in the following cases:
|
||||
- Users create a new task or bucket
|
||||
- You're testing machine limits
|
||||
- [Metrics stop reporting](#alert-when-metrics-stop-reporting)
|
||||
|
||||
## Alert when metrics stop reporting
|
||||
|
||||
The Monitoring template includes a [deadman check](/influxdb/v2.0/monitor-alert/checks/create/#deadman-check) to verify metrics are reported at regular intervals.
|
||||
|
||||
To alert when data stops flowing from your InfluxDB Enterprise instances to your InfluxDB OSS instance, do the following:
|
||||
|
||||
1. [Customize the deadman check](#customize-the-deadman-check) to identify the fields you want to monitor.
|
||||
2. [Create a notification endpoint and rule](#create-a-notification-endpoint-and-rule) to receive notifications when your deadman check is triggered.
|
||||
|
||||
### Customize the deadman check
|
||||
|
||||
1. To view the deadman check, click **Alerts** in the navigation bar of your **InfluxDB OSS** account.
|
||||
|
||||
{{< nav-icon "alerts" >}}
|
||||
|
||||
2. Choose an existing field or create a new field for your deadman alert:
|
||||
    1. Click **{{< icon "plus" "v2" >}} Create** and select **Deadman Check** in the dropdown menu.
|
||||
2. Define your query with at least one field.
|
||||
3. Click **Submit** and **Configure Check**.
|
||||
When metrics stop reporting, you'll receive an alert.
|
||||
3. Under **Schedule Every**, set the amount of time to check for data.
|
||||
4. Set the amount of time to wait before switching to a critical alert.
|
||||
5. Save the check, then click **View History** under the check's gear icon to verify it is running.
|
||||
|
||||
## Create a notification endpoint and rule
|
||||
|
||||
To receive a notification message when your deadman check is triggered, create a [notification endpoint](#create-a-notification-endpoint) and [rule](#create-a-notification-rule).
|
||||
|
||||
### Create a notification endpoint
|
||||
|
||||
InfluxData supports different endpoints: Slack, PagerDuty, and HTTP. Slack is free for all users, while PagerDuty and HTTP are exclusive to the Usage-Based Plan.
|
||||
|
||||
#### Send a notification to Slack
|
||||
|
||||
1. Create a [Slack webhook](https://api.slack.com/messaging/webhooks).
|
||||
2. Go to **Alerts > Notification Endpoint** and click **{{< icon "plus" "v2" >}} Create**, and enter a name and description for your Slack endpoint.
|
||||
3. Enter your Slack Webhook under **Incoming Webhook URL** and click **Create Notification Endpoint**.
|
||||
|
||||
#### Send a notification to PagerDuty or HTTP
|
||||
|
||||
Send a notification to PagerDuty or HTTP endpoints (other webhooks) by [upgrading your InfluxDB OSS account](/influxdb/v2.0/reference/cli/influxd/upgrade/).
|
||||
|
||||
### Create a notification rule
|
||||
|
||||
[Create a notification rule](/influxdb/v2.0/monitor-alert/notification-rules/create/) to set rules for when to send a deadman alert message to your notification endpoint.
|
||||
|
||||
1. Go to **Alerts > Notification Rules** and click **{{< icon "plus" "v2" >}} Create**.
|
||||
2. Fill out the **About** and **Conditions** section then click **Create Notification Rule**.
|
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
title: Renew or update a license key or file
|
||||
description: >
|
||||
Renew or update a license key or file for your InfluxDB enterprise cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Renew a license
|
||||
weight: 50
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
Use this procedure to renew or update an existing license key or file, switch from a license key to a license file, or switch from a license file to a license key.
|
||||
|
||||
> **Note:** To request a new license to renew or expand your InfluxDB Enterprise cluster, contact [sales@influxdb.com](mailto:sales@influxdb.com).
|
||||
|
||||
To update a license key or file, do the following:
|
||||
|
||||
1. If you are switching from a license key to a license file (or vice versa), delete your existing license key or file.
|
||||
2. **Add the license key or file** to your [meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes/#enterprise-license-settings) and [data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes/#enterprise-license-settings) configuration settings. For more information, see [how to configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/configuration/).
|
||||
3. **On each meta node**, run `service influxdb-meta restart`, and wait for the meta node service to come back up successfully before restarting the next meta node.
|
||||
The cluster should remain unaffected as long as only one node is restarting at a time.
|
||||
4. **On each data node**, run `killall -s HUP influxd` to signal the `influxd` process to reload its configuration file. A combined sketch of steps 3 and 4 follows this list.
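The following is a minimal sketch of steps 3 and 4, assuming hypothetical host names, SSH access to each node, and sysvinit service management; adjust the hosts, wait time, and service commands to match your environment.

```bash
# Restart meta nodes one at a time and wait for each to rejoin before continuing.
for host in meta-01 meta-02 meta-03; do
  ssh "$host" 'sudo service influxdb-meta restart'
  sleep 30   # give the meta node time to come back up
done

# Signal each data node to reload its configuration without restarting.
for host in data-01 data-02; do
  ssh "$host" 'sudo killall -s HUP influxd'
done
```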
|
|
@ -0,0 +1,29 @@
|
|||
---
|
||||
title: Stability and compatibility
|
||||
description: >
|
||||
API and storage engine compatibility and stability in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 90
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
## 1.x API compatibility and stability
|
||||
|
||||
One of the more important aspects of the 1.0 release is that this marks the stabilization of our API and storage format. Over the course of the last three years we’ve iterated aggressively, often breaking the API in the process. With the release of 1.0 and for the entire 1.x line of releases we’re committing to the following:
|
||||
|
||||
### No breaking InfluxDB API changes
|
||||
|
||||
When it comes to the InfluxDB API, if a command works in 1.0 it will work unchanged in all 1.x releases...with one caveat. We will be adding [keywords](/enterprise_influxdb/v1.10/query_language/spec/#keywords) to the query language. New keywords won't break your queries if you wrap all [identifiers](/enterprise_influxdb/v1.10/concepts/glossary/#identifier) in double quotes and all string literals in single quotes. This is generally considered best practice so it should be followed anyway. For users following that guideline, the query and ingestion APIs will have no breaking changes for all 1.x releases. Note that this does not include the Go code in the project. The underlying Go API in InfluxDB can and will change over the course of 1.x development. Users should be accessing InfluxDB through the [InfluxDB API](/enterprise_influxdb/v1.10/tools/api/).
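For example, a query written with double-quoted identifiers and single-quoted string literals is immune to new keywords. The database, measurement, and tag names below are hypothetical:

```bash
# Double-quote identifiers, single-quote string literals.
influx -database 'mydb' -execute "SELECT \"value\" FROM \"cpu\" WHERE \"host\" = 'server01'"
```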
|
||||
|
||||
### Storage engine stability
|
||||
|
||||
The [TSM](/enterprise_influxdb/v1.10/concepts/glossary/#tsm-time-structured-merge-tree) storage engine file format is now at version 1. While we may introduce new versions of the format in the 1.x releases, these new versions will run side-by-side with previous versions. What this means for users is there will be no lengthy migrations when upgrading from one 1.x release to another.
|
||||
|
||||
### Additive changes
|
||||
|
||||
The query engine will have additive changes over the course of the new releases. We’ll introduce new query functions and new functionality into the language without breaking backwards compatibility. We may introduce new protocol endpoints (like a binary format) and versions of the line protocol and query API to improve performance and/or functionality, but they will have to run in parallel with the existing versions. Existing versions will be supported for the entirety of the 1.x release line.
|
||||
|
||||
### Ongoing support
|
||||
|
||||
We’ll continue to fix bugs on the 1.x versions of the [line protocol](/enterprise_influxdb/v1.10/concepts/glossary/#influxdb-line-protocol), query API, and TSM storage format. Users should expect to upgrade to the latest 1.x.x release for bug fixes, but those releases will all be compatible with the 1.0 API and won’t require data migrations. For instance, if a user is running 1.2 and there are bug fixes released in 1.3, they should upgrade to the 1.3 release. Until 1.4 is released, patch fixes will go into 1.3.x. Because all future 1.x releases are drop-in replacements for previous 1.x releases, users should upgrade to the latest in the 1.x line to get all bug fixes.
|
|
@ -0,0 +1,308 @@
|
|||
---
|
||||
title: Upgrade InfluxDB Enterprise clusters
|
||||
description: Upgrade to the latest version of InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.10/administration/upgrading/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Upgrade
|
||||
weight: 50
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
To successfully perform a rolling upgrade of InfluxDB Enterprise clusters to {{< latest-patch >}}, complete the following steps:
|
||||
|
||||
1. [Back up your cluster](#back-up-your-cluster).
|
||||
2. [Upgrade meta nodes](#upgrade-meta-nodes).
|
||||
3. [Upgrade data nodes](#upgrade-data-nodes).
|
||||
|
||||
> ***Note:*** A rolling upgrade lets you update your cluster with zero downtime. To downgrade to an earlier version, complete the following procedures, replacing the version numbers with the version that you want to downgrade to.
|
||||
|
||||
## Back up your cluster
|
||||
|
||||
Before performing an upgrade, create a full backup of your InfluxDB Enterprise cluster. Also, if you create incremental backups, trigger a final incremental backup.
|
||||
|
||||
> ***Note:*** For information on performing a final incremental backup or a full backup,
|
||||
> see [Back up and restore InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/backup-and-restore/).
|
||||
|
||||
## Upgrade meta nodes
|
||||
|
||||
Complete the following steps to upgrade meta nodes:
|
||||
|
||||
1. [Download the meta node package](#download-the-meta-node-package).
|
||||
2. [Install the meta node package](#install-the-meta-node-package).
|
||||
3. [Update the meta node configuration file](#update-the-meta-node-configuration-file).
|
||||
4. [Restart the `influxdb-meta` service](#restart-the-influxdb-meta-service).
|
||||
5. Repeat steps 1-4 for each meta node in your cluster.
|
||||
6. [Confirm the meta nodes upgrade](#confirm-the-meta-nodes-upgrade).
|
||||
|
||||
### Download the meta node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Install the meta node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
sudo dpkg -i influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
sudo yum localinstall influxdb-meta-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Update the meta node configuration file
|
||||
|
||||
Migrate any custom settings from your previous meta node configuration file.
|
||||
|
||||
To enable HTTPS, you must update the meta node configuration file (`influxdb-meta.conf`). For information, see [Enable HTTPS within the configuration file for each Meta Node](/enterprise_influxdb/v1.10/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Restart the `influxdb-meta` service
|
||||
|
||||
##### sysvinit systems
|
||||
|
||||
```bash
|
||||
service influxdb-meta restart
|
||||
```
|
||||
|
||||
##### systemd systems
|
||||
|
||||
```bash
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
|
||||
### Confirm the meta nodes upgrade
|
||||
|
||||
After upgrading _**all**_ meta nodes, check your node version numbers using the
|
||||
`influxd-ctl show` command.
|
||||
The [`influxd-ctl` utility](/enterprise_influxdb/v1.10/tools/influxd-ctl/) is available on all meta nodes.
|
||||
|
||||
```bash
|
||||
~# influxd-ctl show
|
||||
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 rk-upgrading-01:8088 1.8.x_c1.8.y
|
||||
5 rk-upgrading-02:8088 1.8.x_c1.8.y
|
||||
6 rk-upgrading-03:8088 1.8.x_c1.8.y
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
|
||||
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
Ensure that the meta cluster is healthy before upgrading the data nodes.
|
||||
|
||||
## Upgrade data nodes
|
||||
|
||||
Complete the following steps to upgrade data nodes:
|
||||
|
||||
1. [Stop traffic to data nodes](#stop-traffic-to-the-data-node).
|
||||
2. [Download the data node package](#download-the-data-node-package).
|
||||
3. [Install the data node package](#install-the-data-node-package).
|
||||
4. [Update the data node configuration file](#update-the-data-node-configuration-file).
|
||||
5. For Time Series Index (TSI) only, [rebuild TSI indexes](#rebuild-tsi-indexes).
|
||||
6. [Restart the `influxdb` service](#restart-the-influxdb-service).
|
||||
7. [Restart traffic to data nodes](#restart-traffic-to-data-nodes).
|
||||
8. Repeat steps 1-7 for each data node in your cluster.
|
||||
9. [Confirm the data nodes upgrade](#confirm-the-data-nodes-upgrade).
|
||||
|
||||
### Stop traffic to the data node
|
||||
To stop traffic to data nodes, **do one of the following:**
|
||||
|
||||
- **Disable traffic to data nodes in the node balancer**
|
||||
|
||||
- If you have access to the load balancer configuration, use your load balancer to stop routing read and write requests to the data node server (port 8086).
|
||||
- If you cannot access the load balancer configuration, work with your networking team to prevent traffic to the data node server before continuing to upgrade.
|
||||
|
||||
{{% note %}}
|
||||
Disabling traffic to a data node in the load balancer still allows other data
|
||||
nodes in the cluster to write to the current data node.
|
||||
{{% /note %}}
|
||||
|
||||
- **Stop the `influxdb` service on the data node**
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
service influxdb stop
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl stop influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
{{% note %}}
|
||||
Stopping the `influxdb` process on the data node takes longer than disabling
|
||||
traffic at the load balancer, but it ensures all writes stop, including writes
|
||||
from other data nodes in the cluster.
|
||||
{{% /note %}}
|
||||
|
||||
### Download the data node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Install the data node package
|
||||
|
||||
When you run the install command, you're prompted to keep or overwrite your
|
||||
current configuration file with the file for version {{< latest-patch >}}.
|
||||
Enter `N` or `O` to keep your current configuration file.
|
||||
You'll make the configuration changes for version {{< latest-patch >}} in the
|
||||
next procedure, [Update the data node configuration file](#update-the-data-node-configuration-file).
|
||||
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
sudo yum localinstall influxdb-data-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Update the data node configuration file
|
||||
|
||||
Migrate any custom settings from your previous data node configuration file.
|
||||
|
||||
- To enable HTTPS, see [Enable HTTPS within the configuration file for each Data Node](/enterprise_influxdb/v1.10/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
- To enable TSI, open `/etc/influxdb/influxdb.conf`, and then adjust and save the settings shown in the following table.
|
||||
|
||||
| Section | Setting |
|
||||
| --------| ----------------------------------------------------------|
|
||||
| `[data]` | <ul><li>To use Time Series Index (TSI) disk-based indexing, add [`index-version = "tsi1"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#index-version-inmem) <li>To use the TSM in-memory index, add [`index-version = "inmem"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#index-version-inmem) <li>Add [`wal-fsync-delay = "0s"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#wal-fsync-delay-0s) <li>Add [`max-concurrent-compactions = 0`](/enterprise_influxdb/v1.10/administration/config-data-nodes#max-concurrent-compactions-0) <li>Set [`cache-max-memory-size`](/enterprise_influxdb/v1.10/administration/config-data-nodes#cache-max-memory-size-1g) to `1073741824` |
|
||||
| `[cluster]` | <ul><li>Add [`pool-max-idle-streams = 100`](/enterprise_influxdb/v1.10/administration/config-data-nodes#pool-max-idle-streams-100) <li>Add [`pool-max-idle-time = "1m0s"`](/enterprise_influxdb/v1.10/administration/config-data-nodes#pool-max-idle-time-60s) <li>Remove `max-remote-write-connections` |
|
||||
|[`[anti-entropy]`](/enterprise_influxdb/v1.10/administration/config-data-nodes#anti-entropy)| <ul><li>Add `enabled = true` <li>Add `check-interval = "30s"` <li>Add `max-fetch = 10`|
|
||||
|`[admin]`| Remove entire section.|
|
||||
|
||||
For more information about TSI, see [TSI overview](/enterprise_influxdb/v1.10/concepts/time-series-index/) and [TSI details](/enterprise_influxdb/v1.10/concepts/tsi-details/).
|
||||
|
||||
### Rebuild TSI indexes
|
||||
|
||||
Complete the following steps for Time Series Index (TSI) only.
|
||||
|
||||
1. Delete all `_series` directories in the `/data` directory (by default, stored at `/data/<dbName>/_series`).
|
||||
|
||||
2. Delete all TSM-based shard `index` directories (by default, located at `/data/<dbName>/<rpName>/<shardID>/index`).
|
||||
|
||||
3. Use the [`influx_inspect buildtsi`](/enterprise_influxdb/v1.10/tools/influx_inspect#buildtsi) utility to rebuild the TSI index. For example, run the following command:
|
||||
|
||||
```bash
|
||||
influx_inspect buildtsi -datadir /yourDataDirectory -waldir /wal
|
||||
```
|
||||
|
||||
   Replace `yourDataDirectory` with the path to your data directory. Running this command converts TSM-based shards to TSI shards or rebuilds existing TSI shards.
|
||||
|
||||
> **Note:** Run the `buildtsi` command using the same system user that runs the `influxd` service, or a user with the same permissions.
|
||||
|
||||
### Restart the `influxdb` service
|
||||
|
||||
Restart the `influxdb` service to restart the data nodes.
|
||||
Do one of the following:
|
||||
|
||||
- **If the `influxdb` service is still running**, but isn't receiving traffic from the load balancer:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
service influxdb restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
- **If the `influxdb` service is stopped**:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
service influxdb start
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl start influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
### Restart traffic to data nodes
|
||||
|
||||
Restart routing read and write requests to the data node server (port 8086) through your load balancer.
|
||||
|
||||
> **Note:** Allow the hinted handoff queue (HHQ) to write all missed data to the updated node before upgrading the next data node. Once all data has been written, the disk space used in the hinted handoff queue should be 0. Check the disk space on your hh directory by running the `du` command, for example, `du /var/lib/influxdb/hh`.
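A minimal sketch of that check, assuming the default hinted handoff directory (adjust the path if you set a custom `[hinted-handoff].dir`):

```bash
# Wait until the hinted handoff directory has no queued segment files left.
while [ -n "$(find /var/lib/influxdb/hh -type f 2>/dev/null)" ]; do
  du -sh /var/lib/influxdb/hh   # show the remaining queue size
  sleep 10
done
echo "Hinted handoff queue is empty; safe to upgrade the next data node."
```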
|
||||
|
||||
### Confirm the data nodes upgrade
|
||||
|
||||
After upgrading _**all**_ data nodes, check your node version numbers using the
|
||||
`influxd-ctl show` command.
|
||||
The [`influxd-ctl` utility](/enterprise_influxdb/v1.10/tools/influxd-ctl/) is available on all meta nodes.
|
||||
|
||||
```bash
|
||||
~# influxd-ctl show
|
||||
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 rk-upgrading-01:8088 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
|
||||
5 rk-upgrading-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
6 rk-upgrading-03:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
If you have any issues upgrading your cluster, contact InfluxData support.
|
|
@ -0,0 +1,12 @@
|
|||
---
|
||||
title: InfluxDB Enterprise concepts
|
||||
description: Clustering and other key concepts in InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.10/concepts/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Concepts
|
||||
weight: 50
|
||||
---
|
||||
|
||||
{{< children hlevel="h2" type="list" >}}
|
|
@ -0,0 +1,141 @@
|
|||
---
|
||||
title: Clustering in InfluxDB Enterprise
|
||||
description: >
|
||||
Learn how meta nodes and data nodes interact in InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.10/concepts/clustering/
|
||||
- /enterprise_influxdb/v1.10/high_availability/clusters/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Clustering
|
||||
weight: 10
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
This document describes in detail how clustering works in InfluxDB Enterprise.
|
||||
It starts with a high level description of the different components of a cluster
|
||||
and then outlines implementation details.
|
||||
|
||||
## Architectural overview
|
||||
|
||||
An InfluxDB Enterprise installation consists of two groups of software processes: data nodes and meta nodes.
|
||||
Communication within a cluster looks like this:
|
||||
|
||||
{{< diagram >}}
|
||||
flowchart TB
|
||||
subgraph meta[Meta Nodes]
|
||||
Meta1 <-- TCP :8089 --> Meta2 <-- TCP :8089 --> Meta3
|
||||
end
|
||||
meta <-- HTTP :8091 --> data
|
||||
subgraph data[Data Nodes]
|
||||
Data1 <-- TCP :8088 --> Data2
|
||||
end
|
||||
{{< /diagram >}}
|
||||
|
||||
The meta nodes communicate with each other via a TCP protocol and the Raft consensus protocol that all use port `8089` by default. This port must be reachable between the meta nodes. The meta nodes also expose an HTTP API bound to port `8091` by default that the `influxd-ctl` command uses.
|
||||
|
||||
Data nodes communicate with each other through a TCP protocol that is bound to port `8088`. Data nodes communicate with the meta nodes through their HTTP API bound to `8091`. These ports must be reachable between the meta and data nodes.
|
||||
|
||||
Within a cluster, all meta nodes must communicate with all other meta nodes. All data nodes must communicate with all other data nodes and all meta nodes.
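As a quick sanity check, you can verify that the default ports are reachable from each node; the host names below are hypothetical:

```bash
# From any node, confirm the meta node Raft/TCP and HTTP API ports are open.
nc -zv meta-01 8089
nc -zv meta-01 8091

# From any meta or data node, confirm the data node TCP port is open.
nc -zv data-01 8088
```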
|
||||
|
||||
The meta nodes keep a consistent view of the metadata that describes the cluster. The meta cluster uses the [HashiCorp implementation of Raft](https://github.com/hashicorp/raft) as the underlying consensus protocol. This is the same implementation that they use in Consul.
|
||||
|
||||
The data nodes replicate data and query each other using the Protobuf protocol over TCP. Details on replication and querying are covered later in this document.
|
||||
|
||||
## Where data lives
|
||||
|
||||
The meta and data nodes are each responsible for different parts of the database.
|
||||
|
||||
### Meta nodes
|
||||
|
||||
Meta nodes hold all of the following meta data:
|
||||
|
||||
* all nodes in the cluster and their role
|
||||
* all databases and retention policies that exist in the cluster
|
||||
* all shards and shard groups, and on what nodes they exist
|
||||
* cluster users and their permissions
|
||||
* all continuous queries
|
||||
|
||||
The meta nodes keep this data in the Raft database on disk, backed by BoltDB. By default the Raft database is `/var/lib/influxdb/meta/raft.db`.
|
||||
|
||||
> **Note:** Meta nodes require the `/meta` directory.
|
||||
|
||||
### Data nodes
|
||||
|
||||
Data nodes hold all of the raw time series data and metadata, including:
|
||||
|
||||
* measurements
|
||||
* tag keys and values
|
||||
* field keys and values
|
||||
|
||||
On disk, the data is always organized by `<database>/<retention_policy>/<shard_id>`. By default the parent directory is `/var/lib/influxdb/data`.
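For example, a hypothetical database named `telegraf` with the default `autogen` retention policy would contain one numbered directory per shard:

```bash
ls /var/lib/influxdb/data/telegraf/autogen/
# 1  2  3   <- one directory per shard ID, each holding that shard's TSM files
```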
|
||||
|
||||
> **Note:** Data nodes require all four subdirectories of `/var/lib/influxdb/`, including `/meta` (specifically, the clients.json file), `/data`, `/wal`, and `/hh`.
|
||||
|
||||
## Optimal server counts
|
||||
|
||||
When creating a cluster, you need to decide how many meta and data nodes to configure and connect. You can think of InfluxDB Enterprise as two separate clusters that communicate with each other: a cluster of meta nodes and one of data nodes. The number of meta nodes is driven by the number of meta node failures the cluster needs to be able to tolerate, while the number of data nodes scales based on your storage and query needs.
|
||||
|
||||
The Raft consensus protocol requires a quorum to perform any operation, so there should always be an odd number of meta nodes. For almost all applications, 3 meta nodes is what you want. It gives you an odd number of meta nodes so that a quorum can be reached. If one meta node is lost, the cluster can still operate with the remaining 2 meta nodes until the third one is replaced. Additional meta nodes exponentially increase the communication overhead and are not recommended unless you expect the cluster to frequently lose meta nodes.
|
||||
|
||||
Data nodes hold the actual time series data. The minimum number of data nodes to run is 1 and can scale up from there. **Generally, you'll want to run a number of data nodes that is evenly divisible by your replication factor.** For instance, if you have a replication factor of 2, you'll want to run 2, 4, 6, 8, 10, etc. data nodes.
|
||||
|
||||
## Chronograf
|
||||
|
||||
[Chronograf](/{{< latest "chronograf" >}}/introduction/getting-started/) is the user interface component of InfluxData’s TICK stack.
|
||||
It makes monitoring and alerting for your infrastructure easy to set up and maintain.
|
||||
It talks directly to the data and meta nodes over their HTTP protocols, which are bound by default to port `8086` for data nodes and port `8091` for meta nodes.
|
||||
|
||||
## Writes in a cluster
|
||||
|
||||
This section describes how writes in a cluster work. We'll work through some examples using a cluster of four data nodes: `A`, `B`, `C`, and `D`. Assume that we have a retention policy with a replication factor of 2 with shard durations of 1 day.
|
||||
|
||||
### Shard groups
|
||||
|
||||
The cluster creates shards within a shard group to maximize the number of data nodes utilized. If there are N data nodes in the cluster and the replication factor is X, then N/X shards are created in each shard group, discarding any fractions.
|
||||
|
||||
This means that a new shard group gets created for each day of data written. Within each shard group, 2 shards are created. Because of the replication factor of 2, each of those two shards is copied to 2 servers. For example, we have a shard group for `2016-09-19` that has two shards, `1` and `2`. Shard `1` is replicated to servers `A` and `B`, while shard `2` is copied to servers `C` and `D`.
|
||||
|
||||
When a write comes in with values that have a timestamp in `2016-09-19` the cluster must first determine which shard within the shard group should receive the write. This is done by taking a hash of the `measurement` + sorted `tagset` (the metaseries) and bucketing into the correct shard. In Go this looks like:
|
||||
|
||||
```go
|
||||
// key is measurement + tagset
|
||||
// shardGroup is the group for the values based on timestamp
|
||||
// hash with fnv and then bucket
|
||||
h := fnv.New64a()
h.Write([]byte(key))
shard := shardGroup.Shards[h.Sum64()%uint64(len(shardGroup.Shards))]
|
||||
```
|
||||
|
||||
There are multiple implications to this scheme for determining where data lives in a cluster. First, for any given metaseries all data on any given day exists in a single shard, and thus only on those servers hosting a copy of that shard. Second, once a shard group is created, adding new servers to the cluster won't scale out write capacity for that shard group. The replication is fixed when the shard group is created.
|
||||
|
||||
However, there is a method for expanding writes in the current shard group (i.e. today) when growing a cluster. The current shard group can be truncated to stop at the current time using `influxd-ctl truncate-shards`. This immediately closes the current shard group, forcing a new shard group to be created. That new shard group inherits the latest retention policy and data node changes and then copies itself appropriately to the newly available data nodes. Run `influxd-ctl truncate-shards help` for more information on the command.
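A minimal sketch of that workflow, run from a meta node after the new data nodes have joined the cluster:

```bash
# Close the current shard group so newly created shards can be placed on the new data nodes.
influxd-ctl truncate-shards

# Verify the truncated shards and the new shard group.
influxd-ctl show-shards
```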
|
||||
|
||||
### Write consistency
|
||||
|
||||
Each request to the HTTP API can specify the consistency level via the `consistency` query parameter. For this example, let's assume that an HTTP write is being sent to server `D` and the data belongs in shard `1`. The write needs to be replicated to the owners of shard `1`: data nodes `A` and `B`. When the write comes into `D`, that node determines from its local cache of the metastore that the write needs to be replicated to `A` and `B`, and it immediately tries to write to both. The subsequent behavior depends on the consistency level chosen:
|
||||
|
||||
* `any` - return success to the client as soon as any node has responded with a write success, or the receiving node has written the data to its hinted handoff queue. In our example, if `A` or `B` return a successful write response to `D`, or if `D` has cached the write in its local hinted handoff, `D` returns a write success to the client.
|
||||
* `one` - return success to the client as soon as any node has responded with a write success, but not if the write is only in hinted handoff. In our example, if `A` or `B` return a successful write response to `D`, `D` returns a write success to the client. If `D` could not send the data to either `A` or `B` but instead put the data in hinted handoff, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
|
||||
* `quorum` - return success when a majority of nodes return success. This option is only useful if the replication factor is greater than 2, otherwise it is equivalent to `all`. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then a majority of nodes have not successfully persisted the write and `D` returns a write failure to the client. If we assume for a moment the data were bound for three nodes, `A`, `B`, and `C`, then if any two of those nodes respond with a write success, `D` returns a write success to the client. If one or fewer nodes respond with a success, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
|
||||
* `all` - return success only when all nodes return success. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then `D` returns a write failure to the client. If we again assume three destination nodes `A`, `B`, and `C`, then if all three nodes respond with a write success, `D` returns a write success to the client. Otherwise, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
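For reference, a write that requests `quorum` consistency looks like the following; the database name and point are hypothetical:

```bash
curl -i -XPOST "http://localhost:8086/write?db=mydb&consistency=quorum" \
  --data-binary 'cpu,host=server01 value=0.64 1434055562000000000'
```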
|
||||
|
||||
The important thing to note is how failures are handled. In the case of failures, the database uses the hinted handoff system.
|
||||
|
||||
### Hinted handoff
|
||||
|
||||
Hinted handoff is how InfluxDB Enterprise deals with data node outages while writes are happening. Hinted handoff is essentially a durable, disk-based queue. When writing at `any`, `one`, or `quorum` consistency, hinted handoff is used when one or more replicas return an error after a success has already been returned to the client. When writing at `all` consistency, writes cannot return success unless all nodes return success. Temporarily stalled or failed writes may still go to the hinted handoff queues, but the cluster would have already returned a failure response to the write. The receiving node creates a separate queue on disk for each data node (and shard) it cannot reach.
|
||||
|
||||
Let's again use the example of a write coming to `D` that should go to shard `1` on `A` and `B`. If we specified a consistency level of `one` and node `A` returns success, `D` immediately returns success to the client even though the write to `B` is still in progress.
|
||||
|
||||
Now let's assume that `B` returns an error. Node `D` then puts the write into its hinted handoff queue for shard `1` on node `B`. In the background, node `D` continues to attempt to empty the hinted handoff queue by writing the data to node `B`. The configuration file has settings for the maximum size and age of data in hinted handoff queues.
|
||||
|
||||
If a data node is restarted, it checks for pending writes in the hinted handoff queues and resumes attempts to replicate the writes. The important thing to note is that the hinted handoff queue is durable and does survive a process restart.
|
||||
|
||||
When restarting nodes within an active cluster, during upgrades or maintenance, for example, other nodes in the cluster store hinted handoff writes to the offline node and replicates them when the node is again available. Thus, a healthy cluster should have enough resource headroom on each data node to handle the burst of hinted handoff writes following a node outage. The returning node needs to handle both the steady state traffic and the queued hinted handoff writes from other nodes, meaning its write traffic will have a significant spike following any outage of more than a few seconds, until the hinted handoff queue drains.
|
||||
|
||||
If a node with pending hinted handoff writes for another data node receives a write destined for that node, it adds the write to the end of the hinted handoff queue rather than attempting a direct write. This ensures that data nodes receive data in mostly chronological order and prevents unnecessary connection attempts while the other node is offline.
|
||||
|
||||
## Queries in a cluster
|
||||
|
||||
Queries in a cluster are distributed based on the time range being queried and the replication factor of the data. For example, if the retention policy has a replication factor of 4, the coordinating data node receiving the query randomly picks any of the 4 data nodes that store a replica of the shard(s) to receive the query. If we assume that the system has shard durations of one day, then for each day of time covered by a query, the coordinating node selects one data node to receive the query for that day.
|
||||
|
||||
The coordinating node executes and fulfills the query locally whenever possible. If a query must scan multiple shard groups (multiple days in the example above), the coordinating node forwards queries to other nodes for shards it does not have locally. The queries are forwarded in parallel while the coordinating node scans its own local data. The queries are distributed to as many nodes as required to query each shard group once. As the results come back from each data node, the coordinating data node combines them into the final result that gets returned to the user.
|
|
@ -0,0 +1,219 @@
|
|||
---
|
||||
title: Compare InfluxDB to SQL databases
|
||||
description: Differences between InfluxDB and SQL databases.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Compare InfluxDB to SQL databases
|
||||
weight: 30
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
InfluxDB is similar to a SQL database, but different in many ways.
|
||||
InfluxDB is purpose-built for time series data.
|
||||
Relational databases _can_ handle time series data, but are not optimized for common time series workloads.
|
||||
InfluxDB is designed to store large volumes of time series data and quickly perform real-time analysis on that data.
|
||||
|
||||
### Timing is everything
|
||||
|
||||
In InfluxDB, a timestamp identifies a single point in any given data series.
|
||||
This is like an SQL database table where the primary key is pre-set by the system and is always time.
|
||||
|
||||
InfluxDB also recognizes that your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) preferences may change over time.
|
||||
In InfluxDB you don't have to define schemas up front.
|
||||
Data points can have one of the fields on a measurement, all of the fields on a measurement, or any number in-between.
|
||||
You can add new fields to a measurement simply by writing a point for that new field.
|
||||
If you need an explanation of the terms measurements, tags, and fields, check out the next section for an SQL database to InfluxDB terminology crosswalk.
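For example, adding a new field to an existing measurement is just another write; the database and field names below are hypothetical:

```bash
curl -i -XPOST "http://localhost:8086/write?db=parks" \
  --data-binary 'foodships,park_id=1,planet=Earth wait_minutes=7i'
```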
|
||||
|
||||
## Terminology
|
||||
|
||||
The table below is a (very) simple example of a table called `foodships` in an SQL database
|
||||
with the unindexed column `#_foodships` and the indexed columns `park_id`, `planet`, and `time`.
|
||||
|
||||
``` sql
|
||||
+---------+---------+---------------------+--------------+
|
||||
| park_id | planet | time | #_foodships |
|
||||
+---------+---------+---------------------+--------------+
|
||||
| 1 | Earth | 1429185600000000000 | 0 |
|
||||
| 1 | Earth | 1429185601000000000 | 3 |
|
||||
| 1 | Earth | 1429185602000000000 | 15 |
|
||||
| 1 | Earth | 1429185603000000000 | 15 |
|
||||
| 2 | Saturn | 1429185600000000000 | 5 |
|
||||
| 2 | Saturn | 1429185601000000000 | 9 |
|
||||
| 2 | Saturn | 1429185602000000000 | 10 |
|
||||
| 2 | Saturn | 1429185603000000000 | 14 |
|
||||
| 3 | Jupiter | 1429185600000000000 | 20 |
|
||||
| 3 | Jupiter | 1429185601000000000 | 21 |
|
||||
| 3 | Jupiter | 1429185602000000000 | 21 |
|
||||
| 3 | Jupiter | 1429185603000000000 | 20 |
|
||||
| 4 | Saturn | 1429185600000000000 | 5 |
|
||||
| 4 | Saturn | 1429185601000000000 | 5 |
|
||||
| 4 | Saturn | 1429185602000000000 | 6 |
|
||||
| 4 | Saturn | 1429185603000000000 | 5 |
|
||||
+---------+---------+---------------------+--------------+
|
||||
```
|
||||
|
||||
Those same data look like this in InfluxDB:
|
||||
|
||||
```sql
|
||||
name: foodships
|
||||
tags: park_id=1, planet=Earth
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 0
|
||||
2015-04-16T12:00:01Z 3
|
||||
2015-04-16T12:00:02Z 15
|
||||
2015-04-16T12:00:03Z 15
|
||||
|
||||
name: foodships
|
||||
tags: park_id=2, planet=Saturn
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 5
|
||||
2015-04-16T12:00:01Z 9
|
||||
2015-04-16T12:00:02Z 10
|
||||
2015-04-16T12:00:03Z 14
|
||||
|
||||
name: foodships
|
||||
tags: park_id=3, planet=Jupiter
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 20
|
||||
2015-04-16T12:00:01Z 21
|
||||
2015-04-16T12:00:02Z 21
|
||||
2015-04-16T12:00:03Z 20
|
||||
|
||||
name: foodships
|
||||
tags: park_id=4, planet=Saturn
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 5
|
||||
2015-04-16T12:00:01Z 5
|
||||
2015-04-16T12:00:02Z 6
|
||||
2015-04-16T12:00:03Z 5
|
||||
```
|
||||
|
||||
Referencing the example above, in general:
|
||||
|
||||
* An InfluxDB measurement (`foodships`) is similar to an SQL database table.
|
||||
* InfluxDB tags ( `park_id` and `planet`) are like indexed columns in an SQL database.
|
||||
* InfluxDB fields (`#_foodships`) are like unindexed columns in an SQL database.
|
||||
* InfluxDB points (for example, `2015-04-16T12:00:00Z 5`) are similar to SQL rows.
|
||||
|
||||
Building on this comparison of database terminology,
|
||||
InfluxDB [continuous queries](/enterprise_influxdb/v1.10/concepts/glossary/#continuous-query-cq)
|
||||
and [retention policies](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) are
|
||||
similar to stored procedures in an SQL database.
|
||||
They're specified once and then performed regularly and automatically.
|
||||
|
||||
Of course, there are some major disparities between SQL databases and InfluxDB.
|
||||
SQL `JOIN`s aren't available for InfluxDB measurements; your schema design should reflect that difference.
|
||||
And, as we mentioned above, a measurement is like an SQL table where the primary index is always pre-set to time.
|
||||
InfluxDB timestamps must be in UNIX epoch (GMT) or formatted as a date-time string valid under RFC3339.
|
||||
|
||||
For more detailed descriptions of the InfluxDB terms mentioned in this section see our [Glossary of Terms](/enterprise_influxdb/v1.10/concepts/glossary/).
|
||||
|
||||
## Query languages
|
||||
InfluxDB supports multiple query languages:
|
||||
|
||||
- [Flux](#flux)
|
||||
- [InfluxQL](#influxql)
|
||||
|
||||
### Flux
|
||||
|
||||
[Flux](/enterprise_influxdb/v1.10/flux/) is a data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
Beginning with **InfluxDB 1.8.0**, Flux is available for production use alongside InfluxQL.
|
||||
|
||||
For those familiar with [InfluxQL](#influxql), Flux is intended to address
|
||||
many of the outstanding feature requests that we've received since introducing InfluxDB 1.0.
|
||||
For a comparison between Flux and InfluxQL, see [Flux vs InfluxQL](/enterprise_influxdb/v1.10/flux/flux-vs-influxql/).
|
||||
|
||||
Flux is the primary language for working with data in [InfluxDB OSS 2.0](/influxdb/v2.0/get-started)
|
||||
and [InfluxDB Cloud](/influxdb/cloud/get-started/),
|
||||
a generally available Platform as a Service (PaaS) offered across multiple cloud service providers.
|
||||
Using Flux with InfluxDB 1.8+ lets you get familiar with Flux concepts and syntax
|
||||
and ease the transition to InfluxDB 2.0.
|
||||
|
||||
### InfluxQL
|
||||
|
||||
InfluxQL is an SQL-like query language for interacting with InfluxDB.
|
||||
It has been crafted to feel familiar to those coming from other
|
||||
SQL or SQL-like environments while also providing features specific
|
||||
to storing and analyzing time series data.
|
||||
However, **InfluxQL is not SQL** and lacks support for more advanced operations
|
||||
like `UNION`, `JOIN` and `HAVING` that SQL power-users are accustomed to.
|
||||
This functionality is available with [Flux](/flux/latest/introduction).
|
||||
|
||||
InfluxQL's `SELECT` statement follows the form of an SQL `SELECT` statement:
|
||||
|
||||
```sql
|
||||
SELECT <stuff> FROM <measurement_name> WHERE <some_conditions>
|
||||
```
|
||||
|
||||
where `WHERE` is optional.
|
||||
|
||||
To get the InfluxDB output in the section above, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships"
|
||||
```
|
||||
|
||||
If you only wanted to see data for the planet `Saturn`, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE "planet" = 'Saturn'
|
||||
```
|
||||
|
||||
If you wanted to see data for the planet `Saturn` after 12:00:01 UTC on April 16, 2015, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE "planet" = 'Saturn' AND time > '2015-04-16 12:00:01'
|
||||
```
|
||||
|
||||
As shown in the example above, InfluxQL allows you to specify the time range of your query in the `WHERE` clause.
|
||||
You can use date-time strings wrapped in single quotes that have the
|
||||
format `YYYY-MM-DD HH:MM:SS.mmm`
|
||||
(`mmm` is milliseconds and is optional, and you can also specify microseconds or nanoseconds).
|
||||
You can also use relative time with `now()` which refers to the server's current timestamp:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE time > now() - 1h
|
||||
```
|
||||
|
||||
That query outputs the data in the `foodships` measurement where the timestamp is newer than the server's current time minus one hour.
|
||||
The options for specifying time durations with `now()` are:
|
||||
|
||||
|Letter|Meaning|
|
||||
|:---:|:---:|
|
||||
| ns | nanoseconds |
|
||||
|u or µ|microseconds|
|
||||
| ms | milliseconds |
|
||||
|s | seconds |
|
||||
| m | minutes |
|
||||
| h | hours |
|
||||
| d | days |
|
||||
| w | weeks |
|
||||
|
||||
InfluxQL also supports regular expressions, arithmetic in expressions, `SHOW` statements, and `GROUP BY` statements.
|
||||
See our [data exploration](/enterprise_influxdb/v1.10/query_language/explore-data/) page for an in-depth discussion of those topics.
|
||||
InfluxQL functions include `COUNT`, `MIN`, `MAX`, `MEDIAN`, `DERIVATIVE` and more.
|
||||
For a full list check out the [functions](/enterprise_influxdb/v1.10/query_language/functions/) page.
|
||||
|
||||
Now that you have the general idea, check out our [Getting Started Guide](/enterprise_influxdb/v1.10/introduction/getting-started/).
|
||||
|
||||
## InfluxDB is not CRUD
|
||||
|
||||
InfluxDB is a database that has been optimized for time series data.
|
||||
This data commonly comes from sources like distributed sensor groups, click data from large websites, or lists of financial transactions.
|
||||
|
||||
One thing this data has in common is that it is more useful in the aggregate.
|
||||
One reading saying that your computer’s CPU is at 12% utilization at 12:38:35 UTC on a Tuesday is hard to draw conclusions from.
|
||||
It becomes more useful when combined with the rest of the series and visualized.
|
||||
This is where trends over time begin to show, and actionable insight can be drawn from the data.
|
||||
In addition, time series data is generally written once and rarely updated.
|
||||
|
||||
The result is that InfluxDB is not a full CRUD database but more like a CR-ud, prioritizing the performance of creating and reading data over update and destroy, and [preventing some update and destroy behaviors](/enterprise_influxdb/v1.10/concepts/insights_tradeoffs/) to make create and read more performant:
|
||||
|
||||
* To update a point, insert one with [the same measurement, tag set, and timestamp](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points).
|
||||
* You can [drop or delete a series](/enterprise_influxdb/v1.10/query_language/manage-database/#drop-series-from-the-index-with-drop-series), but not individual points based on field values. As a workaround, you can search for the field value, retrieve the time, then [DELETE based on the `time` field](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-series-with-delete).
|
||||
* You can't update or rename tags yet - see GitHub issue [#4157](https://github.com/influxdata/influxdb/issues/4157) for more information. To modify the tag of a series of points, find the points with the offending tag value, change the value to the desired one, write the points back, then drop the series with the old tag value.
|
||||
* You can't delete tags by tag key (as opposed to value) - see GitHub issue [#8604](https://github.com/influxdata/influxdb/issues/8604).
|
|
@ -0,0 +1,110 @@
|
|||
---
|
||||
title: InfluxDB file system layout
|
||||
description: >
|
||||
The InfluxDB Enterprise file system layout depends on the operating system, package manager,
|
||||
or containerization platform used to install InfluxDB.
|
||||
weight: 102
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: File system layout
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
The InfluxDB Enterprise file system layout depends on the installation method
|
||||
or containerization platform used to install InfluxDB Enterprise.
|
||||
|
||||
- [InfluxDB Enterprise file structure](#influxdb-enterprise-file-structure)
|
||||
- [File system layout](#file-system-layout)
|
||||
|
||||
## InfluxDB Enterprise file structure
|
||||
The InfluxDB file structure includes the following:
|
||||
|
||||
- [Data directory](#data-directory)
|
||||
- [WAL directory](#wal-directory)
|
||||
- [Metastore directory](#metastore-directory)
|
||||
- [Hinted handoff directory](#hinted-handoff-directory)
|
||||
- [InfluxDB Enterprise configuration files](#influxdb-enterprise-configuration-files)
|
||||
|
||||
### Data directory
|
||||
(**Data nodes only**)
|
||||
Directory path where InfluxDB Enterprise stores time series data (TSM files).
|
||||
To customize this path, use the [`[data].dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir)
|
||||
configuration option.
|
||||
|
||||
### WAL directory
|
||||
(**Data nodes only**)
|
||||
Directory path where InfluxDB Enterprise stores Write Ahead Log (WAL) files.
|
||||
To customize this path, use the [`[data].wal-dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#wal-dir)
|
||||
configuration option.
|
||||
|
||||
### Hinted handoff directory
|
||||
(**Data nodes only**)
|
||||
Directory path where hinted handoff (HH) queues are stored.
|
||||
To customize this path, use the [`[hinted-handoff].dir`](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir)
|
||||
configuration option.
|
||||
|
||||
### Metastore directory
|
||||
Directory path of the InfluxDB Enterprise metastore, which stores information
|
||||
about the cluster, users, databases, retention policies, shards, and continuous queries.
|
||||
|
||||
**On data nodes**, the metastore contains information about InfluxDB Enterprise meta nodes.
|
||||
To customize this path, use the [`[meta].dir` configuration option in your data node configuration file](/enterprise_influxdb/v1.10/administration/config-data-nodes/#dir).
|
||||
|
||||
**On meta nodes**, the metastore contains information about the InfluxDB Enterprise RAFT cluster.
|
||||
To customize this path, use the [`[meta].dir` configuration option in your meta node configuration file](/enterprise_influxdb/v1.10/administration/config-meta-nodes/#dir).
|
||||
|
||||
### InfluxDB Enterprise configuration files
|
||||
InfluxDB Enterprise stores default data node and meta node configuration files on disk.
|
||||
For more information about using InfluxDB Enterprise configuration files, see:
|
||||
|
||||
- [Configure data nodes](/enterprise_influxdb/v1.10/administration/config-data-nodes/)
|
||||
- [Configure meta nodes](/enterprise_influxdb/v1.10/administration/config-meta-nodes/)
|
||||
|
||||
## File system layout
|
||||
InfluxDB Enterprise supports **.deb-** and **.rpm-based** Linux package managers.
|
||||
The file system layout is the same for each.
|
||||
|
||||
- [Data node file system layout](#data-node-file-system-layout)
|
||||
- [Meta node file system layout](#meta-node-file-system-layout)
|
||||
|
||||
### Data node file system layout
|
||||
| Path | Default |
|
||||
| :------------------------------------------------------------------- | :---------------------------- |
|
||||
| [Data directory](#data-directory) | `/var/lib/influxdb/data/` |
|
||||
| [WAL directory](#wal-directory) | `/var/lib/influxdb/wal/` |
|
||||
| [Metastore directory](#metastore-directory) | `/var/lib/influxdb/meta/` |
|
||||
| [Hinted handoff directory](#hinted-handoff-directory) | `/var/lib/influxdb/hh/` |
|
||||
| [Default config file path](#influxdb-enterprise-configuration-files) | `/etc/influxdb/influxdb.conf` |
|
||||
|
||||
##### Data node file system overview
|
||||
{{% filesystem-diagram %}}
|
||||
- /etc/influxdb/
|
||||
- influxdb.conf _<span style="opacity:.4">(Data node configuration file)</span>_
|
||||
- /var/lib/influxdb/
|
||||
- data/
|
||||
- _<span style="opacity:.4">TSM directories and files</span>_
|
||||
- hh/
|
||||
- _<span style="opacity:.4">HH queue files</span>_
|
||||
- meta/
|
||||
- client.json
|
||||
- wal/
|
||||
- _<span style="opacity:.4">WAL directories and files</span>_
|
||||
{{% /filesystem-diagram %}}
|
||||
|
||||
### Meta node file system layout
|
||||
| Path | Default |
|
||||
| :------------------------------------------------------------------- | :--------------------------------- |
|
||||
| [Metastore directory](#metastore-directory) | `/var/lib/influxdb/meta/` |
|
||||
| [Default config file path](#influxdb-enterprise-configuration-files) | `/etc/influxdb/influxdb-meta.conf` |
|
||||
|
||||
##### Meta node file system overview
|
||||
{{% filesystem-diagram %}}
|
||||
- /etc/influxdb/
|
||||
- influxdb-meta.conf _<span style="opacity:.4">(Meta node configuration file)</span>_
|
||||
- /var/lib/influxdb/
|
||||
- meta/
|
||||
- peers.json
|
||||
- raft.db
|
||||
- snapshots/
|
||||
- _<span style="opacity:.4">Snapshot directories and files</span>_
|
||||
{{% /filesystem-diagram %}}
|
|
@ -0,0 +1,456 @@
|
|||
---
|
||||
title: Glossary
|
||||
description: Terms related to InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/concepts/glossary/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 20
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
## aggregation
|
||||
|
||||
An InfluxQL function that returns an aggregated value across a set of points.
|
||||
For a complete list of the available and upcoming aggregations,
|
||||
see [InfluxQL functions](/enterprise_influxdb/v1.10/query_language/functions/#aggregations).
|
||||
|
||||
Related entries: [function](#function), [selector](#selector), [transformation](#transformation)
|
||||
|
||||
## batch
|
||||
|
||||
A collection of data points in InfluxDB line protocol format, separated by newlines (`0x0A`).
|
||||
A batch of points may be submitted to the database using a single HTTP request to the write endpoint.
|
||||
This makes writes using the InfluxDB API much more performant by drastically reducing the HTTP overhead.
|
||||
InfluxData recommends batch sizes of 5,000-10,000 points, although different use cases may be better served by significantly smaller or larger batches.
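For example, a single request to the write endpoint can carry many newline-separated points; the database name and points below are hypothetical:

```bash
curl -i -XPOST "http://localhost:8086/write?db=mydb" --data-binary 'cpu,host=server01 value=0.64 1434055562000000000
cpu,host=server02 value=0.55 1434055562000000000
mem,host=server01 value=33.2 1434055562000000000'
```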
|
||||
|
||||
Related entries: [InfluxDB line protocol](#influxdb-line-protocol), [point](#point)
|
||||
|
||||
## bucket
|
||||
|
||||
A bucket is a named location where time series data is stored in **InfluxDB 2.0**. In InfluxDB 1.8+, each combination of a database and a retention policy (database/retention-policy) represents a bucket. Use the [InfluxDB 2.0 API compatibility endpoints](/enterprise_influxdb/v1.10/tools/api#influxdb-2-0-api-compatibility-endpoints) included with InfluxDB 1.8+ to interact with buckets.
|
||||
|
||||
## continuous query (CQ)
|
||||
|
||||
An InfluxQL query that runs automatically and periodically within a database.
|
||||
Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
|
||||
See [Continuous Queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/).
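A minimal example, using hypothetical database, measurement, and field names:

```bash
influx -execute 'CREATE CONTINUOUS QUERY "cq_30m" ON "mydb" BEGIN SELECT MEAN("value") INTO "cpu_30m" FROM "cpu" GROUP BY time(30m) END'
```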
|
||||
|
||||
|
||||
Related entries: [function](#function)
|
||||
|
||||
## data node
|
||||
|
||||
A node that runs the data service.
|
||||
|
||||
For high availability, installations must have at least two data nodes.
|
||||
The number of data nodes in your cluster must be the same as your highest
|
||||
replication factor.
|
||||
Any replication factor greater than two gives you additional fault tolerance and
|
||||
query capacity within the cluster.
|
||||
|
||||
Data node sizes will depend on your needs.
|
||||
The Amazon EC2 m4.large or m4.xlarge are good starting points.
|
||||
|
||||
Related entries: [data service](#data-service), [replication factor](#replication-factor)
|
||||
|
||||
## data service
|
||||
|
||||
Stores all time series data and handles all writes and queries.
|
||||
|
||||
Related entries: [data node](#data-node)
|
||||
|
||||
## database
|
||||
|
||||
A logical container for users, retention policies, continuous queries, and time series data.
|
||||
|
||||
Related entries: [continuous query](#continuous-query-cq), [retention policy](#retention-policy-rp), [user](#user)
|
||||
|
||||
## duration
|
||||
|
||||
The attribute of the retention policy that determines how long InfluxDB stores data.
|
||||
Data older than the duration are automatically dropped from the database.
|
||||
See [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy) for how to set duration.
|
||||
|
||||
Related entries: [retention policy](#retention-policy-rp)
|
||||
|
||||
## field
|
||||
|
||||
The key-value pair in an InfluxDB data structure that records metadata and the actual data value.
|
||||
Fields are required in InfluxDB data structures and they are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags.
|
||||
|
||||
*Query tip:* Compare fields to tags; tags are indexed.
|
||||
|
||||
Related entries: [field key](#field-key), [field set](#field-set), [field value](#field-value), [tag](#tag)
|
||||
|
||||
## field key
|
||||
|
||||
The key part of the key-value pair that makes up a field.
|
||||
Field keys are strings and they store metadata.
|
||||
|
||||
Related entries: [field](#field), [field set](#field-set), [field value](#field-value), [tag key](#tag-key)
|
||||
|
||||
## field set
|
||||
|
||||
The collection of field keys and field values on a point.
|
||||
|
||||
Related entries: [field](#field), [field key](#field-key), [field value](#field-value), [point](#point)
|
||||
|
||||
## field value
|
||||
|
||||
The value part of the key-value pair that makes up a field.
|
||||
Field values are the actual data; they can be strings, floats, integers, or booleans.
|
||||
A field value is always associated with a timestamp.
|
||||
|
||||
Field values are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant.
|
||||
|
||||
*Query tip:* Compare field values to tag values; tag values are indexed.
|
||||
|
||||
Related entries: [field](#field), [field key](#field-key), [field set](#field-set), [tag value](#tag-value), [timestamp](#timestamp)
|
||||
|
||||
## function
|
||||
|
||||
InfluxQL aggregations, selectors, and transformations.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/) for a complete list of InfluxQL functions.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [selector](#selector), [transformation](#transformation)
|
||||
|
||||
<!--
|
||||
## grant
|
||||
-->
|
||||
## identifier
|
||||
|
||||
Tokens that refer to continuous query names, database names, field keys,
|
||||
measurement names, retention policy names, subscription names, tag keys, and
|
||||
user names.
|
||||
See [Query Language Specification](/enterprise_influxdb/v1.10/query_language/spec/#identifiers).
|
||||
|
||||
Related entries:
|
||||
[database](#database),
|
||||
[field key](#field-key),
|
||||
[measurement](#measurement),
|
||||
[retention policy](#retention-policy-rp),
|
||||
[tag key](#tag-key),
|
||||
[user](#user)
|
||||
|
||||
## InfluxDB line protocol
|
||||
|
||||
The text-based format for writing points to InfluxDB. See [InfluxDB line protocol](/enterprise_influxdb/v1.10/write_protocols/).
|
||||
|
||||
## measurement
|
||||
|
||||
The part of the InfluxDB data structure that describes the data stored in the associated fields.
|
||||
Measurements are strings.
|
||||
|
||||
Related entries: [field](#field), [series](#series)
|
||||
|
||||
## meta node
|
||||
|
||||
A node that runs the meta service.
|
||||
|
||||
For high availability, installations must have three meta nodes.
|
||||
Meta nodes can be very modestly sized instances like an EC2 t2.micro or even a
|
||||
nano.
|
||||
For additional fault tolerance, installations may use five meta nodes; the
|
||||
number of meta nodes must be an odd number.
|
||||
|
||||
Related entries: [meta service](#meta-service)
|
||||
|
||||
## meta service
|
||||
|
||||
The consistent data store that keeps state about the cluster, including which
|
||||
servers, databases, users, continuous queries, retention policies, subscriptions,
|
||||
and blocks of time exist.
|
||||
|
||||
Related entries: [meta node](#meta-node)
|
||||
|
||||
## metastore
|
||||
|
||||
Contains internal information about the status of the system.
|
||||
The metastore contains the user information, databases, retention policies, shard metadata, continuous queries, and subscriptions.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [user](#user)
|
||||
|
||||
## node
|
||||
|
||||
An independent `influxd` process.
|
||||
|
||||
Related entries: [server](#server)
|
||||
|
||||
## now()
|
||||
|
||||
The local server's nanosecond timestamp.
|
||||
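For example, `now()` is typically used to query a time range relative to the server's current time (the `census` measurement comes from the sample data in the key concepts documentation):

```
# Return points written in the last hour, relative to now()
> SELECT * FROM "census" WHERE time > now() - 1h
```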
|
||||
## passive node (experimental)
|
||||
|
||||
Passive nodes act as load balancers: they accept write calls, perform shard lookup and RPC calls (on active data nodes), and distribute writes to active data nodes. They do not own shards or accept queries.
|
||||
**Note:** This is an experimental feature.
|
||||
|
||||
<!--
|
||||
## permission
|
||||
-->
|
||||
|
||||
## point
|
||||
|
||||
In InfluxDB, a point represents a single data record, similar to a row in a SQL database table. Each point:
|
||||
|
||||
- has a measurement, a tag set, a field key, a field value, and a timestamp;
|
||||
- is uniquely identified by its series and timestamp.
|
||||
|
||||
You cannot store more than one point with the same timestamp in a series.
|
||||
If you write a point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, and any ties go to the new field set.
|
||||
For more information about duplicate points, see [How does InfluxDB handle duplicate points?](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)
|
||||
|
||||
Related entries: [field set](#field-set), [series](#series), [timestamp](#timestamp)
|
||||
|
||||
## points per second
|
||||
|
||||
A deprecated measurement of the rate at which data are persisted to InfluxDB.
|
||||
The schema allows and even encourages the recording of multiple metric values per point, rendering points per second ambiguous.
|
||||
|
||||
Write speeds are generally quoted in values per second, a more precise metric.
|
||||
|
||||
Related entries: [point](#point), [schema](#schema), [values per second](#values-per-second)
|
||||
|
||||
## query
|
||||
|
||||
An operation that retrieves data from InfluxDB.
|
||||
See [Data Exploration](/enterprise_influxdb/v1.10/query_language/explore-data/), [Schema Exploration](/enterprise_influxdb/v1.10/query_language/explore-schema/), [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/).
|
||||
|
||||
## replication factor (RF)
|
||||
|
||||
The attribute of the retention policy that determines how many copies of the
|
||||
data are stored in the cluster. Replicating copies ensures that data is accessible when one or more data nodes are unavailable.
|
||||
InfluxDB replicates data across `N` data nodes, where `N` is the replication
|
||||
factor.
|
||||
|
||||
To maintain data availability for queries, the replication factor should be less
|
||||
than or equal to the number of data nodes in the cluster:
|
||||
|
||||
* Data is fully available when the replication factor is greater than the
|
||||
number of unavailable data nodes.
|
||||
* Data may be unavailable when the replication factor is less than or equal to the number of
|
||||
unavailable data nodes.
|
||||
|
||||
Any replication factor greater than two gives you additional fault tolerance and
|
||||
query capacity within the cluster.
|
||||
|
||||
Related entries: [duration](#duration), [node](#node),
|
||||
[retention policy](#retention-policy-rp)
|
||||
|
||||
## retention policy (RP)
|
||||
|
||||
Describes how long InfluxDB keeps data (duration), how many copies of the data to store in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.
|
||||
|
||||
When you create a database, InfluxDB creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
|
||||
For more information, see [Retention policy management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
|
||||
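As an illustration, the following InfluxQL sketch creates an RP that sets all three attributes and inspects the result (the RP name, database name, and values are placeholders):

```
# Keep data for 90 days, store two copies of each point, use one-day shard groups,
# and make this the default RP for the database
> CREATE RETENTION POLICY "ninety_days" ON "my_database" DURATION 90d REPLICATION 2 SHARD DURATION 1d DEFAULT

# List the retention policies on the database, including autogen
> SHOW RETENTION POLICIES ON "my_database"
```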
|
||||
Related entries: [duration](#duration), [measurement](#measurement), [replication factor](#replication-factor), [series](#series), [shard duration](#shard-duration), [tag set](#tag-set)
|
||||
|
||||
<!--
|
||||
## role
|
||||
-->
|
||||
## schema
|
||||
|
||||
How the data are organized in InfluxDB.
|
||||
The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys.
|
||||
See [Schema Design](/enterprise_influxdb/v1.10/concepts/schema_and_data_layout/) for more information.
|
||||
|
||||
Related entries: [database](#database), [field key](#field-key), [measurement](#measurement), [retention policy](#retention-policy-rp), [series](#series), [tag key](#tag-key), [tag value](#tag-value)
|
||||
|
||||
## selector
|
||||
|
||||
An InfluxQL function that returns a single point from the range of specified points.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/#selectors) for a complete list of the available and upcoming selectors.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [function](#function), [transformation](#transformation)
|
||||
|
||||
## series
|
||||
|
||||
A logical grouping of data defined by shared measurement, tag set, and field key.
|
||||
|
||||
Related entries: [field set](#field-set), [measurement](#measurement), [tag set](#tag-set)
|
||||
|
||||
## series cardinality
|
||||
|
||||
The number of unique database, measurement, tag set, and field key combinations in an InfluxDB instance.
|
||||
|
||||
For example, assume that an InfluxDB instance has a single database and one measurement.
|
||||
The single measurement has two tag keys: `email` and `status`.
|
||||
If there are three different `email`s, and each email address is associated with two
|
||||
different `status`es then the series cardinality for the measurement is 6
|
||||
(3 * 2 = 6):
|
||||
|
||||
| email | status |
|
||||
| :-------------------- | :----- |
|
||||
| lorr@influxdata.com | start |
|
||||
| lorr@influxdata.com | finish |
|
||||
| marv@influxdata.com | start |
|
||||
| marv@influxdata.com | finish |
|
||||
| cliff@influxdata.com | start |
|
||||
| cliff@influxdata.com | finish |
|
||||
|
||||
Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags.
|
||||
Dependent tags are tags that are scoped by another tag and do not increase series
|
||||
cardinality.
|
||||
If we add the tag `firstname` to the example above, the series cardinality
|
||||
would not be 18 (3 * 2 * 3 = 18).
|
||||
It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
|
||||
|
||||
| email | status | firstname |
|
||||
| :-------------------- | :----- | :-------- |
|
||||
| lorr@influxdata.com | start | lorraine |
|
||||
| lorr@influxdata.com | finish | lorraine |
|
||||
| marv@influxdata.com | start | marvin |
|
||||
| marv@influxdata.com | finish | marvin |
|
||||
| cliff@influxdata.com | start | clifford |
|
||||
| cliff@influxdata.com | finish | clifford |
|
||||
|
||||
See [SHOW CARDINALITY](/enterprise_influxdb/v1.10/query_language/spec/#show-cardinality) to learn about the InfluxQL commands for series cardinality.
|
||||
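For example, assuming a database named `my_database`, the following statements report estimated and exact series cardinality:

```
# Estimated series cardinality for the database
> SHOW SERIES CARDINALITY ON "my_database"

# Exact series cardinality (more expensive to compute)
> SHOW SERIES EXACT CARDINALITY ON "my_database"
```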
|
||||
Related entries: [field key](#field-key),[measurement](#measurement), [tag key](#tag-key), [tag set](#tag-set)
|
||||
|
||||
## series key
|
||||
|
||||
A series key identifies a particular series by measurement, tag set, and field key.
|
||||
|
||||
For example:
|
||||
|
||||
```
|
||||
# measurement, tag set, field key
|
||||
h2o_level, location=santa_monica, h2o_feet
|
||||
```
|
||||
|
||||
Related entries: [series](#series)
|
||||
|
||||
## server
|
||||
|
||||
A machine, virtual or physical, that is running InfluxDB.
|
||||
There should only be one InfluxDB process per server.
|
||||
|
||||
Related entries: [node](#node)
|
||||
|
||||
## shard
|
||||
|
||||
A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk.
|
||||
Every shard belongs to one and only one shard group.
|
||||
Multiple shards may exist in a single shard group.
|
||||
Each shard contains a specific set of series.
|
||||
All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.
|
||||
|
||||
Related entries: [series](#series), [shard duration](#shard-duration), [shard group](#shard-group), [tsm](#tsm-time-structured-merge-tree)
|
||||
|
||||
## shard duration
|
||||
|
||||
The shard duration determines how much time each shard group spans.
|
||||
The specific interval is determined by the `SHARD DURATION` of the retention policy.
|
||||
See [Retention Policy management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management) for more information.
|
||||
|
||||
For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard group](#shard-group)
|
||||
|
||||
## shard group
|
||||
|
||||
Shard groups are logical containers for shards.
|
||||
Shard groups are organized by time and retention policy.
|
||||
Every retention policy that contains data has at least one associated shard group.
|
||||
A given shard group contains all shards with data for the interval covered by the shard group.
|
||||
The interval spanned by each shard group is the shard duration.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard duration](#shard-duration)
|
||||
|
||||
## subscription
|
||||
|
||||
Subscriptions allow [Kapacitor](/{{< latest "kapacitor" >}}/) to receive data from InfluxDB in a push model rather than the pull model based on querying data.
|
||||
When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor.
|
||||
Subscriptions can use TCP or UDP for transmitting the writes.
|
||||
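Kapacitor normally creates its subscriptions automatically once it is configured, but as a rough illustration (the subscription name, database, and endpoint are hypothetical), a subscription can also be created and inspected manually with InfluxQL:

```
# Push all writes to my_database.autogen to a hypothetical Kapacitor endpoint
> CREATE SUBSCRIPTION "kapacitor_sub" ON "my_database"."autogen" DESTINATIONS ALL 'http://kapacitor.example.com:9092'

# List existing subscriptions
> SHOW SUBSCRIPTIONS
```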
|
||||
## tag
|
||||
|
||||
The key-value pair in the InfluxDB data structure that records metadata.
|
||||
Tags are an optional part of the data structure, but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant.
|
||||
*Query tip:* Compare tags to fields; fields are not indexed.
|
||||
|
||||
Related entries: [field](#field), [tag key](#tag-key), [tag set](#tag-set), [tag value](#tag-value)
|
||||
|
||||
## tag key
|
||||
|
||||
The key part of the key-value pair that makes up a tag.
|
||||
Tag keys are strings and they store metadata.
|
||||
Tag keys are indexed so queries on tag keys are performant.
|
||||
|
||||
*Query tip:* Compare tag keys to field keys; field keys are not indexed.
|
||||
|
||||
Related entries: [field key](#field-key), [tag](#tag), [tag set](#tag-set), [tag value](#tag-value)
|
||||
|
||||
## tag set
|
||||
|
||||
The collection of tag keys and tag values on a point.
|
||||
|
||||
Related entries: [point](#point), [series](#series), [tag](#tag), [tag key](#tag-key), [tag value](#tag-value)
|
||||
|
||||
## tag value
|
||||
|
||||
The value part of the key-value pair that makes up a tag.
|
||||
Tag values are strings and they store metadata.
|
||||
Tag values are indexed so queries on tag values are performant.
|
||||
|
||||
|
||||
Related entries: [tag](#tag), [tag key](#tag-key), [tag set](#tag-set)
|
||||
|
||||
## timestamp
|
||||
|
||||
The date and time associated with a point.
|
||||
All time in InfluxDB is UTC.
|
||||
|
||||
For how to specify time when writing data, see [Write Syntax](/enterprise_influxdb/v1.10/write_protocols/write_syntax/).
|
||||
For how to specify time when querying data, see [Data Exploration](/enterprise_influxdb/v1.10/query_language/explore-data/#time-syntax).
|
||||
|
||||
Related entries: [point](#point)
|
||||
|
||||
## transformation
|
||||
|
||||
An InfluxQL function that returns a value or a set of values calculated from specified points, but does not return an aggregated value across those points.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.10/query_language/functions/#transformations) for a complete list of the available and upcoming aggregations.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [function](#function), [selector](#selector)
|
||||
|
||||
## TSM (Time Structured Merge tree)
|
||||
|
||||
The purpose-built data storage format for InfluxDB. TSM allows for greater compaction and higher write and read throughput than existing B+ or LSM tree implementations. See [Storage Engine](/enterprise_influxdb/v1.10/concepts/storage_engine/) for more.
|
||||
|
||||
## user
|
||||
|
||||
There are three kinds of users in InfluxDB Enterprise:
|
||||
|
||||
* *Global admin users* have all permissions.
|
||||
* *Admin users* have `READ` and `WRITE` access to all databases and full access to administrative queries and user management commands.
|
||||
* *Non-admin users* have `READ`, `WRITE`, or `ALL` (both `READ` and `WRITE`) access per database.
|
||||
|
||||
When authentication is enabled, InfluxDB only executes HTTP requests that are sent with a valid username and password.
|
||||
See [Authentication and Authorization](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/).
|
||||
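For example, a sketch of creating users and granting per-database privileges with InfluxQL (user names and passwords are placeholders):

```
# Admin user with access to all databases and administrative commands
> CREATE USER "admin" WITH PASSWORD 'placeholder_password' WITH ALL PRIVILEGES

# Non-admin user with READ access to a single database
> CREATE USER "todd" WITH PASSWORD 'placeholder_password'
> GRANT READ ON "my_database" TO "todd"
```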
|
||||
## values per second
|
||||
|
||||
The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second.
|
||||
|
||||
To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5000 points is written 10 times per second, then the values per second rate is `4 field values per point * 5000 points per batch * 10 batches per second = 200,000 values per second`.
|
||||
|
||||
Related entries: [batch](#batch), [field](#field), [point](#point), [points per second](#points-per-second)
|
||||
|
||||
## WAL (Write Ahead Log)
|
||||
|
||||
The temporary cache for recently written points. To reduce the frequency with which the permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into the TSM.
|
||||
|
||||
Points in the WAL can be queried, and they persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes.
|
||||
|
||||
Related entries: [tsm](#tsm-time-structured-merge-tree)
|
||||
|
||||
## web console
|
||||
|
||||
Legacy user interface for InfluxDB Enterprise.
|
||||
|
||||
This interface has been deprecated. We recommend using [Chronograf](/{{< latest "chronograf" >}}/introduction/).
|
||||
|
||||
If you are transitioning from the Enterprise Web Console to Chronograf, see how to [transition from the InfluxDB Web Admin Interface](/chronograf/v1.7/guides/transition-web-admin-interface/).
|
|
@ -0,0 +1,85 @@
|
|||
---
|
||||
title: InfluxDB Enterprise startup process
|
||||
description: >
|
||||
On startup, InfluxDB Enterprise starts all subsystems and services in a deterministic order.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 10
|
||||
name: Startup process
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
On startup, InfluxDB Enterprise starts all subsystems and services in the following order:
|
||||
|
||||
1. [TSDBStore](#tsdbstore)
|
||||
2. [Monitor](#monitor)
|
||||
3. [Cluster](#cluster)
|
||||
4. [Precreator](#precreator)
|
||||
5. [Snapshotter](#snapshotter)
|
||||
6. [Continuous Query](#continuous-query)
|
||||
7. [Announcer](#announcer)
|
||||
8. [Retention](#retention)
|
||||
9. [Stats](#stats)
|
||||
10. [Anti-entropy](#anti-entropy)
|
||||
11. [HTTP API](#http-api)
|
||||
|
||||
A **subsystem** is a collection of related services managed together as part of a greater whole.
|
||||
A **service** is a process that provides specific functionality.
|
||||
|
||||
## Subsystems and services
|
||||
|
||||
### TSDBStore
|
||||
The TSDBStore subsystem starts and manages the TSM storage engine.
|
||||
This includes services such as the points writer (write), reads (query),
|
||||
and [hinted handoff (HH)](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff).
|
||||
TSDBStore first opens all the shards and loads write-ahead log (WAL) data into the in-memory write cache.
|
||||
If `influxd` was previously shut down cleanly, there will not be any WAL data.
|
||||
It then loads a portion of each shard's index.
|
||||
|
||||
{{% note %}}
|
||||
#### Index versions and startup times
|
||||
If using `inmem` indexing, InfluxDB loads all shard indexes into memory, which,
|
||||
depending on the number of series in the database, can take time.
|
||||
If using `tsi1` indexing, InfluxDB only loads hot shard indexes
|
||||
(the most recent shards or shards currently being written to) into memory and
|
||||
stores cold shard indexes on disk.
|
||||
Use `tsi1` indexing to see shorter startup times.
|
||||
{{% /note %}}
|
||||
|
||||
### Monitor
|
||||
The Monitor service provides statistical and diagnostic information to InfluxDB about InfluxDB itself.
|
||||
This information helps with database troubleshooting and performance analysis.
|
||||
|
||||
### Cluster
|
||||
The Cluster service provides implementations of InfluxDB OSS v1.8 interfaces
|
||||
that operate on an InfluxDB Enterprise v1.8 cluster.
|
||||
|
||||
### Precreator
|
||||
The Precreator service creates shards before they are needed.
|
||||
This ensures necessary shards exist before new time series data arrives and that
|
||||
write throughput is not affected by the creation of a new shard.
|
||||
|
||||
### Snapshotter
|
||||
The Snapshotter service routinely creates snapshots of InfluxDB Enterprise metadata.
|
||||
|
||||
### Continuous Query
|
||||
The Continuous Query (CQ) subsystem manages all InfluxDB CQs.
|
||||
|
||||
### Announcer
|
||||
The Announcer service announces a data node's status to meta nodes.
|
||||
|
||||
### Retention
|
||||
The Retention service enforces [retention policies](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp)
|
||||
and drops data as it expires.
|
||||
|
||||
### Stats
|
||||
The Stats service monitors cluster-level statistics.
|
||||
|
||||
### Anti-entropy
|
||||
The Anti-entropy (AE) subsystem is responsible for reconciling differences between shards.
|
||||
For more information, see [Use anti-entropy](/enterprise_influxdb/v1.10/administration/anti-entropy/).
|
||||
|
||||
### HTTP API
|
||||
The InfluxDB HTTP API service provides a public-facing interface to interact with
|
||||
InfluxDB Enterprise and internal interfaces used within the InfluxDB Enterprise cluster.
|
||||
|
|
@ -0,0 +1,60 @@
|
|||
---
|
||||
title: InfluxDB design insights and tradeoffs
|
||||
description: >
|
||||
Optimizing for time series use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: InfluxDB design insights and tradeoffs
|
||||
weight: 40
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/key-concepts/design-principles/
|
||||
---
|
||||
|
||||
InfluxDB is a time series database.
|
||||
Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
|
||||
Below is a list of some of those design insights that lead to tradeoffs:
|
||||
|
||||
1. For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
|
||||
|
||||
_**Pro:**_ Simplified [conflict resolution](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) increases write performance.
|
||||
_**Con:**_ Cannot store duplicate data; may overwrite data in rare circumstances.
|
||||
|
||||
2. Deletes are a rare occurrence.
|
||||
When they do occur it is almost always against large ranges of old data that are cold for writes.
|
||||
|
||||
_**Pro:**_ Restricting access to deletes allows for increased query and write performance.
|
||||
_**Con:**_ Delete functionality is significantly restricted.
|
||||
|
||||
3. Updates to existing data are a rare occurrence and contentious updates never happen.
|
||||
Time series data is predominantly new data that is never updated.
|
||||
|
||||
_**Pro:**_ Restricting access to updates allows for increased query and write performance.
|
||||
_**Con:**_ Update functionality is significantly restricted.
|
||||
|
||||
4. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order.
|
||||
|
||||
_**Pro:**_ Adding data in time ascending order is significantly more performant.
|
||||
_**Con:**_ Writing points with random times or with time not in ascending order is significantly less performant.
|
||||
|
||||
5. Scale is critical.
|
||||
The database must be able to handle a *high* volume of reads and writes.
|
||||
|
||||
_**Pro:**_ The database can handle a *high* volume of reads and writes.
|
||||
_**Con:**_ The InfluxDB development team was forced to make tradeoffs to increase performance.
|
||||
|
||||
6. Being able to write and query the data is more important than having a strongly consistent view.
|
||||
|
||||
_**Pro:**_ Writing and querying the database can be done by multiple clients and at high loads.
|
||||
_**Con:**_ Query returns may not include the most recent points if database is under heavy load.
|
||||
|
||||
7. Many time [series](/enterprise_influxdb/v1.10/concepts/glossary/#series) are ephemeral.
|
||||
There are often time series that appear only for a few hours and then go away, e.g.
|
||||
a new host that gets started and reports for a while and then gets shut down.
|
||||
|
||||
_**Pro:**_ InfluxDB is good at managing discontinuous data.
|
||||
_**Con:**_ Schema-less design means that some database functions are not supported, e.g. there are no cross-table joins.
|
||||
|
||||
8. No one point is too important.
|
||||
|
||||
_**Pro:**_ InfluxDB has very powerful tools to deal with aggregate data and large data sets.
|
||||
_**Con:**_ Points don't have IDs in the traditional sense; they are differentiated by timestamp and series.
|
|
@ -0,0 +1,202 @@
|
|||
---
|
||||
title: InfluxDB key concepts
|
||||
description: Covers key concepts to learn about InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Key concepts
|
||||
weight: 10
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/key-concepts/
|
||||
---
|
||||
|
||||
Before diving into InfluxDB, it's good to get acquainted with some key concepts of the database. This document introduces key InfluxDB concepts and elements by covering how the following elements work together in InfluxDB:
|
||||
|
||||
- [database](/enterprise_influxdb/v1.10/concepts/glossary/#database)
|
||||
- [field key](/enterprise_influxdb/v1.10/concepts/glossary/#field-key)
|
||||
- [field set](/enterprise_influxdb/v1.10/concepts/glossary/#field-set)
|
||||
- [field value](/enterprise_influxdb/v1.10/concepts/glossary/#field-value)
|
||||
- [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement)
|
||||
- [point](/enterprise_influxdb/v1.10/concepts/glossary/#point)
|
||||
- [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp)
|
||||
- [series](/enterprise_influxdb/v1.10/concepts/glossary/#series)
|
||||
- [tag key](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key)
|
||||
- [tag set](/enterprise_influxdb/v1.10/concepts/glossary/#tag-set)
|
||||
- [tag value](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value)
|
||||
- [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp)
|
||||
|
||||
|
||||
### Sample data
|
||||
|
||||
The next section references the data printed out below.
|
||||
The data are fictional, but represent a believable setup in InfluxDB.
|
||||
They show the number of butterflies and honeybees counted by two scientists (`langstroth` and `perpetua`) in two locations (location `1` and location `2`) over the time period from August 18, 2015 at midnight through August 18, 2015 at 6:12 AM.
|
||||
Assume that the data live in a database called `my_database` and are subject to the `autogen` retention policy (more on databases and retention policies to come).
|
||||
|
||||
*Hint:* Hover over the links for tooltips to get acquainted with InfluxDB terminology and the layout.
|
||||
|
||||
**name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
|
||||
|
||||
| time | <span class ="tooltip" data-tooltip-text ="Field key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Field key">honeybees</span> | <span class ="tooltip" data-tooltip-text ="Tag key">location</span> | <span class ="tooltip" data-tooltip-text ="Tag key">scientist</span> |
|
||||
| ---- | ------------------------------------------------------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------------------- | -------------------------------------------------------------------- |
|
||||
| 2015-08-18T00:00:00Z | 12 | 23 | 1 | langstroth |
|
||||
| 2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
|
||||
| 2015-08-18T00:06:00Z | 11 | 28 | 1 | langstroth |
|
||||
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">3</span> | <span class ="tooltip" data-tooltip-text ="Field value">28</span> | <span class ="tooltip" data-tooltip-text ="Tag value">1</span> | <span class ="tooltip" data-tooltip-text ="Tag value">perpetua</span> |
|
||||
| 2015-08-18T05:54:00Z | 2 | 11 | 2 | langstroth |
|
||||
| 2015-08-18T06:00:00Z | 1 | 10 | 2 | langstroth |
|
||||
| 2015-08-18T06:06:00Z | 8 | 23 | 2 | perpetua |
|
||||
| 2015-08-18T06:12:00Z | 7 | 22 | 2 | perpetua |
|
||||
|
||||
### Discussion
|
||||
|
||||
Now that you've seen some sample data in InfluxDB, this section covers what it all means.
|
||||
|
||||
InfluxDB is a time series database so it makes sense to start with what is at the root of everything we do: time.
|
||||
In the data above there's a column called `time` - all data in InfluxDB have that column.
|
||||
`time` stores timestamps, and the <a name="timestamp"></a>_**timestamp**_ shows the date and time, in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) UTC, associated with particular data.
|
||||
|
||||
The next two columns, called `butterflies` and `honeybees`, are fields.
|
||||
Fields are made up of field keys and field values.
|
||||
<a name="field-key"></a>_**Field keys**_ (`butterflies` and `honeybees`) are strings; the field key `butterflies` tells us that the field values `12`-`7` refer to butterflies and the field key `honeybees` tells us that the field values `23`-`22` refer to, well, honeybees.
|
||||
|
||||
<a name="field-value"></a>_**Field values**_ are your data; they can be strings, floats, integers, or Booleans, and, because InfluxDB is a time series database, a field value is always associated with a timestamp.
|
||||
The field values in the sample data are:
|
||||
|
||||
```
|
||||
12 23
|
||||
1 30
|
||||
11 28
|
||||
3 28
|
||||
2 11
|
||||
1 10
|
||||
8 23
|
||||
7 22
|
||||
```
|
||||
|
||||
In the data above, the collection of field-key and field-value pairs make up a <a name="field-set"></a>_**field set**_.
|
||||
Here are all eight field sets in the sample data:
|
||||
|
||||
* `butterflies = 12 honeybees = 23`
|
||||
* `butterflies = 1 honeybees = 30`
|
||||
* `butterflies = 11 honeybees = 28`
|
||||
* `butterflies = 3 honeybees = 28`
|
||||
* `butterflies = 2 honeybees = 11`
|
||||
* `butterflies = 1 honeybees = 10`
|
||||
* `butterflies = 8 honeybees = 23`
|
||||
* `butterflies = 7 honeybees = 22`
|
||||
|
||||
Fields are a required piece of the InfluxDB data structure - you cannot have data in InfluxDB without fields.
|
||||
It's also important to note that fields are not indexed.
|
||||
[Queries](/enterprise_influxdb/v1.10/concepts/glossary/#query) that use field values as filters must scan all values that match the other conditions in the query.
|
||||
As a result, those queries are not performant relative to queries on tags (more on tags below).
|
||||
In general, fields should not contain commonly-queried metadata.
|
||||
|
||||
The last two columns in the sample data, called `location` and `scientist`, are tags.
|
||||
Tags are made up of tag keys and tag values.
|
||||
Both <a name="tag-key"></a>_**tag keys**_ and <a name="tag-value"></a>_**tag values**_ are stored as strings and record metadata.
|
||||
The tag keys in the sample data are `location` and `scientist`.
|
||||
The tag key `location` has two tag values: `1` and `2`.
|
||||
The tag key `scientist` also has two tag values: `langstroth` and `perpetua`.
|
||||
|
||||
In the data above, the <a name="tag-set"></a>_**tag set**_ is the different combinations of all the tag key-value pairs.
|
||||
The four tag sets in the sample data are:
|
||||
|
||||
* `location = 1`, `scientist = langstroth`
|
||||
* `location = 2`, `scientist = langstroth`
|
||||
* `location = 1`, `scientist = perpetua`
|
||||
* `location = 2`, `scientist = perpetua`
|
||||
|
||||
Tags are optional.
|
||||
You don't need to have tags in your data structure, but it's generally a good idea to make use of them because, unlike fields, tags are indexed.
|
||||
This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata.
|
||||
|
||||
Avoid using the following reserved keys:
|
||||
|
||||
* `_field`
|
||||
* `_measurement`
|
||||
* `time`
|
||||
|
||||
If reserved keys are included as a tag or field key, the associated point is discarded.
|
||||
|
||||
> **Why indexing matters: The schema case study**
|
||||
|
||||
> Say you notice that most of your queries focus on the values of the field keys `honeybees` and `butterflies`:
|
||||
|
||||
> `SELECT * FROM "census" WHERE "butterflies" = 1`
|
||||
> `SELECT * FROM "census" WHERE "honeybees" = 23`
|
||||
|
||||
> Because fields aren't indexed, InfluxDB scans every value of `butterflies` in the first query and every value of `honeybees` in the second query before it provides a response.
|
||||
That behavior can hurt query response times - especially on a much larger scale.
|
||||
To optimize your queries, it may be beneficial to rearrange your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) such that the fields (`butterflies` and `honeybees`) become the tags and the tags (`location` and `scientist`) become the fields:
|
||||
|
||||
> **name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
|
||||
>
|
||||
| time | <span class ="tooltip" data-tooltip-text ="Field key">location</span> | <span class ="tooltip" data-tooltip-text ="Field key">scientist</span> | <span class ="tooltip" data-tooltip-text ="Tag key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Tag key">honeybees</span> |
|
||||
| ---- | --------------------------------------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------- |
|
||||
| 2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
|
||||
| 2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
|
||||
| 2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
|
||||
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">1</span> | <span class ="tooltip" data-tooltip-text ="Field value">perpetua</span> | <span class ="tooltip" data-tooltip-text ="Tag value">3</span> | <span class ="tooltip" data-tooltip-text ="Tag value">28</span> |
|
||||
| 2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
|
||||
| 2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
|
||||
| 2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
|
||||
| 2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
|
||||
|
||||
> Now that `butterflies` and `honeybees` are tags, InfluxDB won't have to scan every one of their values when it performs the queries above - this means that your queries are even faster.
|
||||
|
||||
The <a name=measurement></a>_**measurement**_ acts as a container for tags, fields, and the `time` column, and the measurement name is the description of the data that are stored in the associated fields.
|
||||
Measurement names are strings, and, for any SQL users out there, a measurement is conceptually similar to a table.
|
||||
The only measurement in the sample data is `census`.
|
||||
The name `census` tells us that the field values record the number of `butterflies` and `honeybees` - not their size, direction, or some sort of happiness index.
|
||||
|
||||
A single measurement can belong to different retention policies.
|
||||
A <a name="retention-policy"></a>_**retention policy**_ describes how long InfluxDB keeps data (`DURATION`) and how many copies of the data are stored in the cluster (`REPLICATION`).
|
||||
If you're interested in reading more about retention policies, check out [Database Management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
|
||||
|
||||
{{% warn %}} Replication factors do not serve a purpose with single node instances.
|
||||
{{% /warn %}}
|
||||
|
||||
In the sample data, everything in the `census` measurement belongs to the `autogen` retention policy.
|
||||
InfluxDB automatically creates that retention policy; it has an infinite duration and a replication factor set to one.
|
||||
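Because a single measurement can belong to different retention policies, a query can fully qualify a measurement with its database and retention policy. For example:

```
# Query the census measurement in my_database under the autogen retention policy
> SELECT * FROM "my_database"."autogen"."census"
```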
|
||||
Now that you're familiar with measurements, tag sets, and retention policies, let's discuss series.
|
||||
In InfluxDB, a <a name=series></a>_**series**_ is a collection of points that share a measurement, tag set, and field key.
|
||||
The data above consist of eight series:
|
||||
|
||||
| Series number | Measurement | Tag set | Field key |
|
||||
|:------------------------ | ----------- | ------- | --------- |
|
||||
| series 1 | `census` | `location = 1`,`scientist = langstroth` | `butterflies` |
|
||||
| series 2 | `census` | `location = 2`,`scientist = langstroth` | `butterflies` |
|
||||
| series 3 | `census` | `location = 1`,`scientist = perpetua` | `butterflies` |
|
||||
| series 4 | `census` | `location = 2`,`scientist = perpetua` | `butterflies` |
|
||||
| series 5 | `census` | `location = 1`,`scientist = langstroth` | `honeybees` |
|
||||
| series 6 | `census` | `location = 2`,`scientist = langstroth` | `honeybees` |
|
||||
| series 7 | `census` | `location = 1`,`scientist = perpetua` | `honeybees` |
|
||||
| series 8 | `census` | `location = 2`,`scientist = perpetua` | `honeybees` |
|
||||
|
||||
Understanding the concept of a series is essential when designing your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) and when working with your data in InfluxDB.
|
||||
|
||||
A <a name="point"></a>_**point**_ represents a single data record that has four components: a measurement, tag set, field set, and a timestamp. A point is uniquely identified by its series and timestamp.
|
||||
|
||||
For example, here's a single point:
|
||||
```
|
||||
name: census
|
||||
-----------------
|
||||
time butterflies honeybees location scientist
|
||||
2015-08-18T00:00:00Z 1 30 1 perpetua
|
||||
```
|
||||
|
||||
The point in this example is part of series 3 and 7 and defined by the measurement (`census`), the tag set (`location = 1`, `scientist = perpetua`), the field set (`butterflies = 1`, `honeybees = 30`), and the timestamp `2015-08-18T00:00:00Z`.
|
||||
|
||||
All of the stuff we've just covered is stored in a database - the sample data are in the database `my_database`.
|
||||
An InfluxDB <a name=database></a>_**database**_ is similar to traditional relational databases and serves as a logical container for users, retention policies, continuous queries, and, of course, your time series data.
|
||||
See [Authentication and Authorization](/enterprise_influxdb/v1.10/administration/authentication_and_authorization/) and [Continuous Queries](/enterprise_influxdb/v1.10/query_language/continuous_queries/) for more on those topics.
|
||||
|
||||
Databases can have several users, continuous queries, retention policies, and measurements.
|
||||
InfluxDB is a schemaless database which means it's easy to add new measurements, tags, and fields at any time.
|
||||
It's designed to make working with time series data awesome.
|
||||
|
||||
You made it!
|
||||
You've covered the fundamental concepts and terminology in InfluxDB.
|
||||
If you're just starting out, we recommend taking a look at [Getting Started](/enterprise_influxdb/v1.10/introduction/getting_started/) and the [Writing Data](/enterprise_influxdb/v1.10/guides/writing_data/) and [Querying Data](/enterprise_influxdb/v1.10/guides/querying_data/) guides.
|
||||
May our time series database serve you well 🕔.
|
|
@ -0,0 +1,279 @@
|
|||
---
|
||||
title: InfluxDB schema design and data layout
|
||||
description: >
|
||||
General guidelines for InfluxDB schema design and data layout.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Schema design and data layout
|
||||
weight: 50
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
Each InfluxDB use case is unique and your [schema](/enterprise_influxdb/v1.10/concepts/glossary/#schema) reflects that uniqueness.
|
||||
In general, a schema designed for querying leads to simpler and more performant queries.
|
||||
We recommend the following design guidelines for most use cases:
|
||||
|
||||
- [Where to store data (tag or field)](#where-to-store-data-tag-or-field)
|
||||
- [Avoid too many series](#avoid-too-many-series)
|
||||
- [Use recommended naming conventions](#use-recommended-naming-conventions)
|
||||
- [Shard Group Duration Management](#shard-group-duration-management)
|
||||
|
||||
## Where to store data (tag or field)
|
||||
|
||||
Your queries should guide what data you store in [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag) and what you store in [fields](/enterprise_influxdb/v1.10/concepts/glossary/#field):
|
||||
|
||||
- Store commonly-queried and grouping ([`group()`](/flux/v0.x/stdlib/universe/group) or [`GROUP BY`](/enterprise_influxdb/v1.10/query_language/explore-data/#group-by-tags)) metadata in tags.
|
||||
- Store data in fields if each data point contains a different value.
|
||||
- Store numeric values as fields ([tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) only support string values).
|
||||
|
||||
## Avoid too many series
|
||||
|
||||
InfluxDB indexes the following data elements to speed up reads:
|
||||
|
||||
- [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement)
|
||||
- [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag)
|
||||
|
||||
[Tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) are indexed and [field values](/enterprise_influxdb/v1.10/concepts/glossary/#field-value) are not.
|
||||
This means that querying by tags is more performant than querying by fields.
|
||||
However, when too many indexes are created, both writes and reads may start to slow down.
|
||||
|
||||
Each unique set of indexed data elements forms a [series key](/enterprise_influxdb/v1.10/concepts/glossary/#series-key).
|
||||
[Tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/enterprise_influxdb/v1.10/concepts/glossary/#series), also known as high [series cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality).
|
||||
High series cardinality is a primary driver of high memory usage for many database workloads.
|
||||
Therefore, to reduce memory consumption, consider storing high-cardinality values in field values rather than in tags or field keys.
|
||||
|
||||
{{% note %}}
|
||||
|
||||
If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series).
|
||||
See [how to find and reduce high series cardinality](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
|
||||
|
||||
{{% /note %}}
|
||||
|
||||
## Use recommended naming conventions
|
||||
|
||||
Use the following conventions when naming your tag and field keys:
|
||||
|
||||
- [Avoid reserved keywords in tag and field keys](#avoid-reserved-keywords-in-tag-and-field-keys)
|
||||
- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
|
||||
- [Avoid encoding data in measurements and keys](#avoid-encoding-data-in-measurements-and-keys)
|
||||
- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
|
||||
|
||||
### Avoid reserved keywords in tag and field keys
|
||||
|
||||
It's not required, but avoiding the use of reserved keywords in your tag keys and field keys simplifies writing queries because you won't have to wrap your keys in double quotes.
|
||||
See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
|
||||
|
||||
Also, if a tag key or field key contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
|
||||
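For example (the `temp-avg` field key is hypothetical), a key containing a character outside `[A-z,_]` must be double-quoted in InfluxQL:

```
# Double quotes are required because the field key contains a hyphen
> SELECT "temp-avg" FROM "weather_sensor" WHERE "region" = 'north'
```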
|
||||
### Avoid the same name for a tag and a field
|
||||
|
||||
Avoid using the same name for a tag and field key.
|
||||
This often results in unexpected behavior when querying data.
|
||||
|
||||
If you inadvertently add the same name for a tag and a field, see
|
||||
[Frequently asked questions](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
|
||||
for information about how to query the data predictably and how to fix the issue.
|
||||
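If a tag and a field do end up sharing a name, InfluxQL can disambiguate them with `::tag` and `::field` casts. A hypothetical example where `location` exists as both a tag and a field:

```
# Select the tag value and the field value separately when both are named "location"
> SELECT "location"::tag, "location"::field FROM "weather_sensor"
```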
|
||||
### Avoid encoding data in measurements and keys
|
||||
|
||||
Store data in [tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) or [field values](/enterprise_influxdb/v1.10/concepts/glossary/#field-value), not in [tag keys](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key), [field keys](/enterprise_influxdb/v1.10/concepts/glossary/#field-key), or [measurements](/enterprise_influxdb/v1.10/concepts/glossary/#measurement). If you design your schema to store data in tag and field values,
|
||||
your queries will be easier to write and more efficient.
|
||||
|
||||
In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
|
||||
To learn more about the performance impact of high series cardinality, see [how to find and reduce high series cardinality](/enterprise_influxdb/v1.10/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
|
||||
|
||||
#### Compare schemas
|
||||
|
||||
Compare the following valid schemas represented by line protocol.
|
||||
|
||||
**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
|
||||
|
||||
##### {id="good-measurements-schema"}
|
||||
```
|
||||
Good Measurements schema - Data encoded in tags (recommended)
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
|
||||
|
||||
##### {id="bad-measurements-schema"}
|
||||
```
|
||||
Bad Measurements schema - Data encoded in the measurement (not recommended)
|
||||
-------------
|
||||
blueberries.plot-1.north temp=50.1 1472515200000000000
|
||||
blueberries.plot-2.midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
|
||||
|
||||
##### {id="bad-keys-schema"}
|
||||
```
|
||||
Bad Keys schema - Data encoded in field keys (not recommended)
|
||||
-------------
|
||||
weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
|
||||
weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
#### Compare queries
|
||||
|
||||
Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
|
||||
The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
|
||||
|
||||
**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
|
||||
|
||||
```js
|
||||
// Query *Good Measurements*, data stored in separate tags (recommended)
|
||||
from(bucket: "<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|
||||
|> mean()
|
||||
```
|
||||
|
||||
**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
|
||||
|
||||
```js
|
||||
// Query *Bad Measurements*, data encoded in the measurement (not recommended)
|
||||
from(bucket: "<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
|
||||
|> mean()
|
||||
```
|
||||
|
||||
Complex measurements make some queries impossible. For example, calculating the average temperature of both plots is not possible with the [_Bad Measurements_](#bad-measurements-schema) schema.
|
||||
|
||||
|
||||
##### InfluxQL example to query schemas
|
||||
|
||||
```
|
||||
# Query *Bad Measurements*, data encoded in the measurement (not recommended)
|
||||
> SELECT mean("temp") FROM /\.north$/
|
||||
|
||||
# Query *Good Measurements*, data stored in separate tag values (recommended)
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
|
||||
```
|
||||
|
||||
### Avoid putting more than one piece of information in one tag
|
||||
|
||||
Splitting a single tag with multiple pieces into separate tags simplifies your queries and improves performance by
|
||||
reducing the need for regular expressions.
|
||||
|
||||
Consider the following schema represented by line protocol.
|
||||
|
||||
#### Example line protocol schemas
|
||||
|
||||
```
|
||||
Schema 1 - Multiple data encoded in a single tag
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
Schema 1 encodes two separate parameters, `plot` and `region`, into a single long tag value (`plot-1.north`).
|
||||
Compare this to the following schema represented in line protocol.
|
||||
|
||||
```
|
||||
Schema 2 - Data encoded in multiple tags
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
|
||||
Schema 2 is preferable because, with separate tags, you don't need a regular expression.
|
||||
|
||||
#### Flux example to query schemas
|
||||
|
||||
```js
|
||||
// Schema 1 - Query for multiple data encoded in a single tag
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
|
||||
|> mean()
|
||||
|
||||
// Schema 2 - Query for data encoded in multiple tags
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|
||||
|> mean()
|
||||
```
|
||||
|
||||
#### InfluxQL example to query schemas
|
||||
|
||||
```
|
||||
# Schema 1 - Query for multiple data encoded in a single tag
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE location =~ /\.north$/
|
||||
|
||||
# Schema 2 - Query for data encoded in multiple tags
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE region = 'north'
|
||||
```
|
||||
|
||||
## Shard group duration management
|
||||
|
||||
### Shard group duration overview
|
||||
|
||||
InfluxDB stores data in shard groups.
|
||||
Shard groups are organized by [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) (RP) and store data with timestamps that fall within a specific time interval called the [shard duration](/enterprise_influxdb/v1.10/concepts/glossary/#shard-duration).
|
||||
|
||||
If no shard group duration is provided, the shard group duration is determined by the RP [duration](/enterprise_influxdb/v1.10/concepts/glossary/#duration) at the time the RP is created. The default values are:
|
||||
|
||||
| RP Duration | Shard Group Duration |
|
||||
|---|---|
|
||||
| < 2 days | 1 hour |
|
||||
| >= 2 days and <= 6 months | 1 day |
|
||||
| > 6 months | 7 days |
|
||||
|
||||
The shard group duration is also configurable per RP.
|
||||
To configure the shard group duration, see [Retention Policy Management](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
|
||||
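For example, a sketch of changing the shard group duration of an existing RP (the RP name, database name, and duration are illustrative):

```
# Only shard groups created after this change use the new duration
> ALTER RETENTION POLICY "one_year" ON "my_database" SHARD DURATION 1d
```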
|
||||
### Shard group duration tradeoffs
|
||||
|
||||
Determining the optimal shard group duration requires finding the balance between:
|
||||
|
||||
- Better overall performance with longer shards
|
||||
- Flexibility provided by shorter shards
|
||||
|
||||
#### Long shard group duration
|
||||
|
||||
Longer shard group durations let InfluxDB store more data in the same logical location.
|
||||
This reduces data duplication, improves compression efficiency, and improves query speed in some cases.
|
||||
|
||||
#### Short shard group duration
|
||||
|
||||
Shorter shard group durations allow the system to more efficiently drop data and record incremental backups.
|
||||
When InfluxDB enforces an RP, it drops entire shard groups, not individual data points, even if the points are older than the RP duration.
|
||||
A shard group is only removed once the shard group's *end time* is older than the RP duration.
|
||||
|
||||
For example, if your RP has a duration of one day, InfluxDB will drop an hour's worth of data every hour and will always have 25 shard groups: one for each hour in the day, plus an extra shard group that is partially expiring but isn't removed until the whole shard group is older than 24 hours.
|
||||
|
||||
>**Note:** A special use case to consider: filtering queries on schema data (such as tags, series, measurements) by time. For example, if you want to filter schema data within a one hour interval, you must set the shard group duration to 1h. For more information, see [filter schema data by time](/enterprise_influxdb/v1.10/query_language/explore-schema/#filter-meta-queries-by-time).
|
||||
|
||||
### Shard group duration recommendations
|
||||
|
||||
The default shard group durations work well for most cases. However, high-throughput or long-running instances will benefit from using longer shard group durations.
|
||||
Here are some recommendations for longer shard group durations:
|
||||
|
||||
| RP Duration | Shard Group Duration |
|
||||
|---|---|
|
||||
| <= 1 day | 6 hours |
|
||||
| > 1 day and <= 7 days | 1 day |
|
||||
| > 7 days and <= 3 months | 7 days |
|
||||
| > 3 months | 30 days |
|
||||
| infinite | 52 weeks or longer |
|
||||
|
||||
> **Note:** `INF` (infinite) is not a [valid shard group duration](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
|
||||
In extreme cases where data covers decades and will never be deleted, a long shard group duration like `1040w` (20 years) is perfectly valid.
|
||||
|
||||
Other factors to consider before setting shard group duration:
|
||||
|
||||
* Shard groups should be twice as long as the longest time range of the most frequent queries
|
||||
* Shard groups should each contain more than 100,000 [points](/enterprise_influxdb/v1.10/concepts/glossary/#point)
|
||||
* Shard groups should each contain more than 1,000 points per [series](/enterprise_influxdb/v1.10/concepts/glossary/#series)
|
||||
|
||||
#### Shard group duration for backfilling
|
||||
|
||||
Bulk insertion of historical data covering a large time range in the past will trigger the creation of a large number of shards at once.
|
||||
The concurrent access and overhead of writing to hundreds or thousands of shards can quickly lead to slow performance and memory exhaustion.
|
||||
|
||||
When writing historical data, we highly recommend temporarily setting a longer shard group duration so fewer shards are created. Typically, a shard group duration of 52 weeks works well for backfilling.
|
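As a sketch of that workflow (the RP name and durations are illustrative), temporarily lengthen the shard group duration, perform the backfill, then restore a duration suited to ongoing writes:

```
# Before the backfill: long shard groups so the historical write creates few shards
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 52w

# ...perform the bulk historical write...

# After the backfill: restore a shorter duration for ongoing writes
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 7d
```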
|
@ -0,0 +1,438 @@
|
|||
---
|
||||
title: In-memory indexing and the Time-Structured Merge Tree (TSM)
|
||||
description: >
|
||||
InfluxDB storage engine, in-memory indexing, and the Time-Structured Merge Tree (TSM) in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: In-memory indexing with TSM
|
||||
weight: 60
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/internals/storage-engine/
|
||||
---
|
||||
|
||||
## The InfluxDB storage engine and the Time-Structured Merge Tree (TSM)
|
||||
|
||||
The InfluxDB storage engine looks very similar to an LSM Tree.
|
||||
It has a write ahead log and a collection of read-only data files which are similar in concept to SSTables in an LSM Tree.
|
||||
TSM files contain sorted, compressed series data.
|
||||
|
||||
InfluxDB will create a [shard](/enterprise_influxdb/v1.10/concepts/glossary/#shard) for each block of time.
|
||||
For example, if you have a [retention policy](/enterprise_influxdb/v1.10/concepts/glossary/#retention-policy-rp) with an unlimited duration, shards will be created for each 7 day block of time.
|
||||
Each of these shards maps to an underlying storage engine database.
|
||||
Each of these databases has its own [WAL](/enterprise_influxdb/v1.10/concepts/glossary/#wal-write-ahead-log) and TSM files.
|
||||
|
||||
We'll dig into each of these parts of the storage engine.
|
||||
|
||||
## Storage engine
|
||||
|
||||
The storage engine ties a number of components together and provides the external interface for storing and querying series data. It is composed of a number of components that each serve a particular role:
|
||||
|
||||
* In-Memory Index - The in-memory index is a shared index across shards that provides quick access to [measurements](/enterprise_influxdb/v1.10/concepts/glossary/#measurement), [tags](/enterprise_influxdb/v1.10/concepts/glossary/#tag), and [series](/enterprise_influxdb/v1.10/concepts/glossary/#series). The index is used by the engine, but is not specific to the storage engine itself.
|
||||
* WAL - The WAL is a write-optimized storage format that allows for writes to be durable, but not easily queryable. Writes to the WAL are appended to segments of a fixed size.
|
||||
* Cache - The Cache is an in-memory representation of the data stored in the WAL. It is queried at runtime and merged with the data stored in TSM files.
|
||||
* TSM Files - TSM files store compressed series data in a columnar format.
|
||||
* FileStore - The FileStore mediates access to all TSM files on disk. It ensures that TSM files are installed atomically when existing ones are replaced as well as removing TSM files that are no longer used.
|
||||
* Compactor - The Compactor is responsible for converting less optimized Cache and TSM data into more read-optimized formats. It does this by compressing series, removing deleted data, optimizing indices and combining smaller files into larger ones.
|
||||
* Compaction Planner - The Compaction Planner determines which TSM files are ready for a compaction and ensures that multiple concurrent compactions do not interfere with each other.
|
||||
* Compression - Compression is handled by various Encoders and Decoders for specific data types. Some encoders are fairly static and always encode the same type the same way; others switch their compression strategy based on the shape of the data.
|
||||
* Writers/Readers - Each file type (WAL segments, TSM files, tombstones, etc.) has Writers and Readers for working with the formats.
|
||||
|
||||
### Write Ahead Log (WAL)
|
||||
|
||||
The WAL is organized as a bunch of files that look like `_000001.wal`.
|
||||
The file numbers are monotonically increasing and referred to as WAL segments.
|
||||
When a segment reaches 10MB in size, it is closed and a new one is opened. Each WAL segment stores multiple compressed blocks of writes and deletes.
|
||||
|
||||
When a write comes in, the new points are serialized, compressed using Snappy, and written to a WAL file.
|
||||
The file is `fsync`'d and the data is added to an in-memory index before a success is returned.
|
||||
This means that batching points together is required to achieve high throughput performance.
|
||||
(Optimal batch size seems to be 5,000-10,000 points per batch for many use cases.)
|
||||
|
||||
Each entry in the WAL follows a [TLV standard](https://en.wikipedia.org/wiki/Type-length-value) with a single byte representing the type of entry (write or delete), a 4 byte `uint32` for the length of the compressed block, and then the compressed block.
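As a rough sketch of this framing (an illustration only, not the actual WAL writer; the big-endian byte order is an assumption made for the example), a segment entry could be assembled like this in Go:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frameWALEntry sketches the TLV layout described above: one type byte,
// a 4-byte uint32 length, then the (already Snappy-compressed) block.
// The big-endian length is an assumption for this illustration.
func frameWALEntry(entryType byte, compressedBlock []byte) []byte {
	buf := make([]byte, 0, 5+len(compressedBlock))
	buf = append(buf, entryType)
	buf = binary.BigEndian.AppendUint32(buf, uint32(len(compressedBlock)))
	return append(buf, compressedBlock...)
}

func main() {
	entry := frameWALEntry(0x01, []byte("compressed points..."))
	fmt.Printf("type=%#x length=%d total=%d bytes\n",
		entry[0], binary.BigEndian.Uint32(entry[1:5]), len(entry))
}
```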
|
||||
|
||||
### Cache
|
||||
|
||||
The Cache is an in-memory copy of all data points currently stored in the WAL.
|
||||
The points are organized by the key, which is the measurement, [tag set](/enterprise_influxdb/v1.10/concepts/glossary/#tag-set), and unique [field](/enterprise_influxdb/v1.10/concepts/glossary/#field).
|
||||
Each field is kept as its own time-ordered range.
|
||||
The Cache data is not compressed while in memory.
|
||||
|
||||
Queries to the storage engine will merge data from the Cache with data from the TSM files.
|
||||
Queries execute on a copy of the data that is made from the cache at query processing time.
|
||||
This way writes that come in while a query is running won't affect the result.
|
||||
|
||||
Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
|
||||
|
||||
The Cache exposes a few controls for snapshotting behavior.
|
||||
The two most important controls are the memory limits.
|
||||
There is a lower bound, [`cache-snapshot-memory-size`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#cache-snapshot-memory-size), which when exceeded will trigger a snapshot to TSM files and remove the corresponding WAL segments.
|
||||
There is also an upper bound, [`cache-max-memory-size`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes#cache-max-memory-size-1g), which when exceeded will cause the Cache to reject new writes.
|
||||
These configurations are useful to prevent out of memory situations and to apply back pressure to clients writing data faster than the instance can persist it.
|
||||
The checks for memory thresholds occur on every write.
|
||||
|
||||
The other snapshot controls are time based.
|
||||
The idle threshold, [`cache-snapshot-write-cold-duration`](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes#cache-snapshot-write-cold-duration), forces the Cache to snapshot to TSM files if it hasn't received a write within the specified interval.
|
||||
|
||||
The in-memory Cache is recreated on restart by re-reading the WAL files on disk.
|
||||
|
||||
### TSM files
|
||||
|
||||
TSM files are a collection of read-only files that are memory mapped.
|
||||
The structure of these files looks very similar to an SSTable in LevelDB or other LSM Tree variants.
|
||||
|
||||
A TSM file is composed of four sections: header, blocks, index, and footer.
|
||||
|
||||
```
|
||||
+--------+------------------------------------+-------------+--------------+
|
||||
| Header | Blocks | Index | Footer |
|
||||
|5 bytes | N bytes | N bytes | 4 bytes |
|
||||
+--------+------------------------------------+-------------+--------------+
|
||||
```
|
||||
|
||||
The Header is a magic number to identify the file type and a version number.
|
||||
|
||||
```
|
||||
+-------------------+
|
||||
| Header |
|
||||
+-------------------+
|
||||
| Magic │ Version |
|
||||
| 4 bytes │ 1 byte |
|
||||
+-------------------+
|
||||
```
|
||||
|
||||
Blocks are sequences of pairs of CRC32 checksums and data.
|
||||
The block data is opaque to the file.
|
||||
The CRC32 is used for block level error detection.
|
||||
The length of the blocks is stored in the index.
|
||||
|
||||
```
|
||||
+--------------------------------------------------------------------+
|
||||
│ Blocks │
|
||||
+---------------------+-----------------------+----------------------+
|
||||
| Block 1 | Block 2 | Block N |
|
||||
+---------------------+-----------------------+----------------------+
|
||||
| CRC | Data | CRC | Data | CRC | Data |
|
||||
| 4 bytes | N bytes | 4 bytes | N bytes | 4 bytes | N bytes |
|
||||
+---------------------+-----------------------+----------------------+
|
||||
```
|
||||
|
||||
Following the blocks is the index for the blocks in the file.
|
||||
The index is composed of a sequence of index entries ordered lexicographically by key and then by time.
|
||||
The key includes the measurement name, tag set, and one field.
|
||||
Multiple fields per point create multiple index entries in the TSM file.
|
||||
Each index entry starts with a key length and the key, followed by the block type (float, int, bool, string) and a count of the number of index block entries that follow for that key.
|
||||
Each index block entry is composed of the min and max time for the block, the offset into the file where the block is located and the size of the block. There is one index block entry for each block in the TSM file that contains the key.
|
||||
|
||||
The index structure can provide efficient access to all blocks as well as the ability to determine the cost associated with accessing a given key.
|
||||
Given a key and timestamp, we can determine whether a file contains the block for that timestamp.
|
||||
We can also determine where that block resides and how much data must be read to retrieve the block.
|
||||
Knowing the size of the block, we can efficiently provision our IO statements.
|
||||
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
│ Index │
|
||||
+-----------------------------------------------------------------------------+
|
||||
│ Key Len │ Key │ Type │ Count │Min Time │Max Time │ Offset │ Size │...│
|
||||
│ 2 bytes │ N bytes │1 byte│2 bytes│ 8 bytes │ 8 bytes │8 bytes │4 bytes │ │
|
||||
+-----------------------------------------------------------------------------+
|
||||
```
|
||||
|
||||
The last section is the footer that stores the offset of the start of the index.
|
||||
|
||||
```
|
||||
+---------+
|
||||
│ Footer │
|
||||
+---------+
|
||||
│Index Ofs│
|
||||
│ 8 bytes │
|
||||
+---------+
|
||||
```
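For illustration, a reader could use this footer to locate the index as in the following Go sketch (the file name and big-endian byte order are assumptions for the example, not the actual TSM reader):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// readIndexOffset reads the last 8 bytes of a TSM file to find where the
// index begins, as described above.
func readIndexOffset(path string) (uint64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	footer := make([]byte, 8)
	if _, err := f.ReadAt(footer, info.Size()-8); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint64(footer), nil
}

func main() {
	offset, err := readIndexOffset("000000001-000000001.tsm")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("index starts at byte", offset)
}
```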
|
||||
|
||||
### Compression
|
||||
|
||||
Each block is compressed to reduce storage space and disk IO when querying.
|
||||
A block contains the timestamps and values for a given series and field.
|
||||
Each block has one byte header, followed by the compressed timestamps and then the compressed values.
|
||||
|
||||
```
|
||||
+--------------------------------------------------+
|
||||
| Type | Len | Timestamps | Values |
|
||||
|1 Byte | VByte | N Bytes | N Bytes │
|
||||
+--------------------------------------------------+
|
||||
```
|
||||
|
||||
The timestamps and values are compressed and stored separately using encodings dependent on the data type and its shape.
|
||||
Storing them independently allows timestamp encoding to be used for all timestamps, while allowing different encodings for different field types.
|
||||
For example, some points may be able to use run-length encoding whereas others may not.
|
||||
|
||||
Each value type also contains a 1 byte header indicating the type of compression for the remaining bytes.
|
||||
The four high bits store the compression type and the four low bits are used by the encoder if needed.
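For example, splitting that header byte into its two halves might look like this minimal Go fragment (an illustration with a made-up header value, not the engine's decoder):

```go
package main

import "fmt"

func main() {
	// Hypothetical one-byte value header taken from the start of a block's
	// compressed values.
	var header byte = 0x12

	compressionType := header >> 4 // four high bits select the compression type
	encoderBits := header & 0x0F   // four low bits are free for the encoder to use

	fmt.Println(compressionType, encoderBits) // 1 2
}
```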
|
||||
|
||||
#### Timestamps
|
||||
|
||||
Timestamp encoding is adaptive and based on the structure of the timestamps that are encoded.
|
||||
It uses a combination of delta encoding, scaling, and compression using simple8b run-length encoding, as well as falling back to no compression if needed.
|
||||
|
||||
Timestamp resolution is variable but can be as granular as a nanosecond, requiring up to 8 bytes to store uncompressed.
|
||||
During encoding, the values are first delta-encoded.
|
||||
The first value is the starting timestamp and subsequent values are the differences from the prior value.
|
||||
This usually converts the values into much smaller integers that are easier to compress.
|
||||
Many timestamps are also monotonically increasing and fall on even boundaries of time such as every 10s.
|
||||
When timestamps have this structure, they are scaled by the largest common divisor that is also a factor of 10.
|
||||
This has the effect of converting very large integer deltas into smaller ones that compress even better.
|
||||
|
||||
Using these adjusted values, if all the deltas are the same, the time range is stored using run-length encoding.
|
||||
If run-length encoding is not possible and all values are less than (1 << 60) - 1 ([~36.5 years](https://www.wolframalpha.com/input/?i=\(1+%3C%3C+60\)+-+1+nanoseconds+to+years) at nanosecond resolution), then the timestamps are encoded using [simple8b encoding](https://github.com/jwilder/encoding/tree/master/simple8b).
|
||||
Simple8b encoding is a 64bit word-aligned integer encoding that packs multiple integers into a single 64bit word.
|
||||
If any value exceeds the maximum, the deltas are stored uncompressed using 8 bytes each for the block.
|
||||
Future encodings may use a patched scheme such as Patched Frame-Of-Reference (PFOR) to handle outliers more effectively.
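The following Go sketch walks through the delta-and-scale steps on a small example. It is a simplified illustration of the idea described above, not the engine's actual encoder:

```go
package main

import "fmt"

// deltaEncode keeps the first timestamp and replaces each subsequent
// timestamp with its difference from the previous one.
func deltaEncode(ts []int64) (first int64, deltas []int64) {
	if len(ts) == 0 {
		return 0, nil
	}
	first = ts[0]
	for i := 1; i < len(ts); i++ {
		deltas = append(deltas, ts[i]-ts[i-1])
	}
	return first, deltas
}

// scaleByTen divides the deltas by the largest power of 10 (up to 10^18)
// that divides all of them, mirroring the scaling step described above.
func scaleByTen(deltas []int64) (scaled []int64, divisor int64) {
	divisor = 1
	for i := 0; i < 18 && len(deltas) > 0; i++ {
		next := divisor * 10
		ok := true
		for _, d := range deltas {
			if d%next != 0 {
				ok = false
				break
			}
		}
		if !ok {
			break
		}
		divisor = next
	}
	scaled = make([]int64, len(deltas))
	for i, d := range deltas {
		scaled[i] = d / divisor
	}
	return scaled, divisor
}

func main() {
	// Four points written every 10s, expressed as nanosecond timestamps.
	ts := []int64{0, 10_000_000_000, 20_000_000_000, 30_000_000_000}
	first, deltas := deltaEncode(ts)
	scaled, divisor := scaleByTen(deltas)
	fmt.Println(first, deltas)   // 0 [10000000000 10000000000 10000000000]
	fmt.Println(scaled, divisor) // [1 1 1] 10000000000
	// All scaled deltas are identical, so this block is a candidate for
	// run-length encoding; otherwise simple8b or raw storage would be used.
}
```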
|
||||
|
||||
#### Floats
|
||||
|
||||
Floats are encoded using an implementation of the [Facebook Gorilla paper](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
|
||||
The encoding XORs consecutive values together to produce a small result when the values are close together.
|
||||
The delta is then stored using control bits to indicate how many leading and trailing zeroes are in the XOR value.
|
||||
Our implementation removes the timestamp encoding described in the paper and only encodes the float values.
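To make the XOR idea concrete, here is a tiny Go illustration (not the actual Gorilla encoder) of why close values compress well:

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

func main() {
	prev := math.Float64bits(23.5)
	cur := math.Float64bits(23.6)

	x := prev ^ cur // XOR of two consecutive, similar values

	// Close values share the sign, exponent, and high mantissa bits, so the
	// XOR has a long run of leading zero bits. An encoder only needs to store
	// the zero-run lengths plus the short meaningful section in between.
	fmt.Printf("xor = %064b\n", x)
	fmt.Println("leading zeros: ", bits.LeadingZeros64(x))
	fmt.Println("trailing zeros:", bits.TrailingZeros64(x))
}
```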
|
||||
|
||||
#### Integers
|
||||
|
||||
Integer encoding uses two different strategies depending on the range of values in the uncompressed data.
|
||||
Encoded values are first encoded using [ZigZag encoding](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers).
|
||||
This interleaves positive and negative integers across a range of positive integers.
|
||||
|
||||
For example, [-2,-1,0,1] becomes [3,1,0,2].
|
||||
See Google's [Protocol Buffers documentation](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers) for more information.
|
||||
|
||||
If all ZigZag encoded values are less than (1 << 60) - 1, they are compressed using simple8b encoding.
|
||||
If any values are larger than the maximum then all values are stored uncompressed in the block.
|
||||
If all values are identical, run-length encoding is used.
|
||||
This works very well for values that are frequently constant.
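A minimal Go sketch of the ZigZag step described above (an illustration only, not the engine's encoder):

```go
package main

import "fmt"

// zigzagEncode maps signed integers onto unsigned integers so that values
// with a small absolute value become small: 0 => 0, -1 => 1, 1 => 2, -2 => 3, ...
func zigzagEncode(v int64) uint64 {
	return uint64((v << 1) ^ (v >> 63))
}

// zigzagDecode reverses the mapping.
func zigzagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	in := []int64{-2, -1, 0, 1}
	out := make([]uint64, 0, len(in))
	for _, v := range in {
		out = append(out, zigzagEncode(v))
	}
	fmt.Println(out)                  // [3 1 0 2]
	fmt.Println(zigzagDecode(out[0])) // -2
}
```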
|
||||
|
||||
#### Booleans
|
||||
|
||||
Booleans are encoded using a simple bit packing strategy where each Boolean uses 1 bit.
|
||||
The number of Booleans encoded is stored using variable-byte encoding at the beginning of the block.
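A simple Go sketch of this kind of bit packing follows (the bit order within a byte is an illustrative choice, not the actual block format):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packBooleans writes a variable-byte count followed by the values packed
// one bit per Boolean.
func packBooleans(values []bool) []byte {
	out := binary.AppendUvarint(nil, uint64(len(values)))
	var cur byte
	for i, v := range values {
		if v {
			cur |= 1 << (uint(i) % 8)
		}
		if i%8 == 7 {
			out = append(out, cur)
			cur = 0
		}
	}
	if len(values)%8 != 0 {
		out = append(out, cur)
	}
	return out
}

func main() {
	encoded := packBooleans([]bool{true, false, true, true})
	fmt.Printf("% x\n", encoded) // 04 0d: count of 4, then the packed bits
}
```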
|
||||
|
||||
#### Strings
|
||||
Strings are encoded using [Snappy](http://google.github.io/snappy/) compression.
|
||||
Each string is packed consecutively and they are compressed as one larger block.
|
||||
|
||||
### Compactions
|
||||
|
||||
Compactions are recurring processes that migrate data stored in a write-optimized format into a more read-optimized format.
|
||||
There are a number of stages of compaction that take place while a shard is hot for writes:
|
||||
|
||||
- **Snapshots** - Values in the Cache and WAL must be converted to TSM files to free memory and disk space used by the WAL segments.
|
||||
These compactions occur based on the cache memory and time thresholds.
|
||||
- **Level Compactions** - Level compactions (levels 1-4) occur as the TSM files grow.
|
||||
TSM files are compacted from snapshots to level 1 files.
|
||||
Multiple level 1 files are compacted to produce level 2 files.
|
||||
The process continues until files reach level 4 (full compaction) and the max size for a TSM file.
|
||||
They will not be compacted further unless deletes, index optimization compactions, or full compactions need to run.
|
||||
Lower level compactions use strategies that avoid CPU-intensive activities like decompressing and combining blocks.
|
||||
Higher level (and thus less frequent) compactions will re-combine blocks to fully compact them and increase the compression ratio.
|
||||
- **Index Optimization** - When many level 4 TSM files accumulate, the internal indexes become larger and more costly to access.
|
||||
An index optimization compaction splits the series and indices across a new set of TSM files, sorting all points for a given series into one TSM file.
|
||||
Before an index optimization, each TSM file contained points for most or all series, and thus each contains the same series index.
|
||||
After an index optimization, each TSM file contains points from a minimum of series and there is little series overlap between files.
|
||||
Each TSM file thus has a smaller unique series index, instead of a duplicate of the full series list.
|
||||
In addition, all points from a particular series are contiguous in a TSM file rather than spread across multiple TSM files.
|
||||
- **Full Compactions** - Full compactions (level 4 compactions) run when a shard has become cold for writes for a long time, or when deletes have occurred on the shard.
|
||||
Full compactions produce an optimal set of TSM files and include all optimizations from Level and Index Optimization compactions.
|
||||
Once a shard is fully compacted, no other compactions will run on it unless new writes or deletes are stored.
|
||||
|
||||
### Writes
|
||||
|
||||
Writes are appended to the current WAL segment and are also added to the Cache.
|
||||
Each WAL segment has a maximum size.
|
||||
Writes roll over to a new file once the current file fills up.
|
||||
The cache is also size bounded; snapshots are taken and WAL compactions are initiated when the cache becomes too full.
|
||||
If the inbound write rate exceeds the WAL compaction rate for a sustained period, the cache may become too full, in which case new writes will fail until the snapshot process catches up.
|
||||
|
||||
When WAL segments fill up and are closed, the Compactor snapshots the Cache and writes the data to a new TSM file.
|
||||
When the TSM file is successfully written and `fsync`'d, it is loaded and referenced by the FileStore.
|
||||
|
||||
### Updates
|
||||
|
||||
Updates (writing a newer value for a point that already exists) occur as normal writes.
|
||||
Since cached values overwrite existing values, newer writes take precedence.
|
||||
If a write would overwrite a point in a prior TSM file, the points are merged at query runtime and the newer write takes precedence.
|
||||
|
||||
|
||||
### Deletes
|
||||
|
||||
Deletes occur by writing a delete entry to the WAL for the measurement or series and then updating the Cache and FileStore.
|
||||
The Cache evicts all relevant entries.
|
||||
The FileStore writes a tombstone file for each TSM file that contains relevant data.
|
||||
These tombstone files are used at startup time to ignore blocks as well as during compactions to remove deleted entries.
|
||||
|
||||
Queries against partially deleted series are handled at query time until a compaction removes the data fully from the TSM files.
|
||||
|
||||
### Queries
|
||||
|
||||
When a query is executed by the storage engine, it is essentially a seek to a given time associated with a specific series key and field.
|
||||
First, we do a search on the data files to find the files that contain a time range matching the query as well as containing matching series.
|
||||
|
||||
Once we have the data files selected, we next need to find the position in the file of the series key index entries.
|
||||
We run a binary search against each TSM index to find the location of its index blocks.
|
||||
|
||||
In common cases the blocks will not overlap across multiple TSM files and we can search the index entries linearly to find the start block from which to read.
|
||||
If there are overlapping blocks of time, the index entries are sorted to ensure newer writes will take precedence and that blocks can be processed in order during query execution.
|
||||
|
||||
When iterating over the index entries, the blocks are read sequentially from the blocks section.
|
||||
The block is decompressed and we seek to the specific point.
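As a simplified illustration of this lookup (using a made-up in-memory index and key format, not the actual TSM reader), the search could look like this:

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry is a toy, in-memory stand-in for a TSM index entry: series keys
// are sorted lexicographically and each entry points at one block.
type indexEntry struct {
	key              string
	minTime, maxTime int64
	offset, size     int64
}

// findKey binary-searches for the first entry with the given key, then scans
// forward linearly over the time-ordered entries for that key.
func findKey(entries []indexEntry, key string) []indexEntry {
	i := sort.Search(len(entries), func(i int) bool { return entries[i].key >= key })
	var out []indexEntry
	for ; i < len(entries) && entries[i].key == key; i++ {
		out = append(out, entries[i])
	}
	return out
}

func main() {
	// The key strings below are made up for the example.
	idx := []indexEntry{
		{"cpu,host=a usage_user", 0, 999, 5, 128},
		{"cpu,host=a usage_user", 1000, 1999, 133, 140},
		{"cpu,host=b usage_user", 0, 999, 273, 120},
	}
	for _, e := range findKey(idx, "cpu,host=a usage_user") {
		fmt.Printf("block at offset %d (%d bytes), time %d..%d\n",
			e.offset, e.size, e.minTime, e.maxTime)
	}
}
```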
|
||||
|
||||
|
||||
# The new InfluxDB storage engine: from LSM Tree to B+Tree and back again to create the Time Structured Merge Tree
|
||||
|
||||
Writing a new storage format should be a last resort.
|
||||
So how did InfluxData end up writing our own engine?
|
||||
InfluxData has experimented with many storage formats and found each lacking in some fundamental way.
|
||||
The performance requirements for InfluxDB are significant, and eventually overwhelm other storage systems.
|
||||
The 0.8 line of InfluxDB allowed multiple storage engines, including LevelDB, RocksDB, HyperLevelDB, and LMDB.
|
||||
The 0.9 line of InfluxDB used BoltDB as the underlying storage engine.
|
||||
This writeup is about the Time Structured Merge Tree storage engine that was released in 0.9.5 and is the only storage engine supported in InfluxDB 0.11+, including the entire 1.x family.
|
||||
|
||||
The properties of the time series data use case make it challenging for many existing storage engines.
|
||||
Over the course of InfluxDB development, InfluxData tried a few of the more popular options.
|
||||
We started with LevelDB, an engine based on LSM Trees, which are optimized for write throughput.
|
||||
After that we tried BoltDB, an engine based on a memory mapped B+Tree, which is optimized for reads.
|
||||
Finally, we ended up building our own storage engine that is similar in many ways to LSM Trees.
|
||||
|
||||
With our new storage engine we were able to achieve up to a 45x reduction in disk space usage from our B+Tree setup with even greater write throughput and compression than what we saw with LevelDB and its variants.
|
||||
This post will cover the details of that evolution and end with an in-depth look at our new storage engine and its inner workings.
|
||||
|
||||
## Properties of time series data
|
||||
|
||||
The workload of time series data is quite different from normal database workloads.
|
||||
There are a number of factors that conspire to make it very difficult to scale and remain performant:
|
||||
|
||||
* Billions of individual data points
|
||||
* High write throughput
|
||||
* High read throughput
|
||||
* Large deletes (data expiration)
|
||||
* Mostly an insert/append workload, very few updates
|
||||
|
||||
The first and most obvious problem is one of scale.
|
||||
In DevOps, IoT, or APM it is easy to collect hundreds of millions or billions of unique data points every day.
|
||||
|
||||
For example, let's say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds.
|
||||
Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day per server.
|
||||
That gives us a total of 172,800,000 (`200 * 100 * 8,640`) individual data points per day.
|
||||
We find similar or larger numbers in sensor data use cases.
|
||||
|
||||
The volume of data means that the write throughput can be very high.
|
||||
We regularly get requests for setups that can handle hundreds of thousands of writes per second.
|
||||
Some larger companies will only consider systems that can handle millions of writes per second.
|
||||
|
||||
At the same time, time series data can be a high read throughput use case.
|
||||
It's true that if you're tracking 700,000 unique metrics or time series you can't hope to visualize all of them.
|
||||
That leads many people to think that you don't actually read most of the data that goes into the database.
|
||||
However, other than dashboards that people have up on their screens, there are automated systems for monitoring or combining the large volume of time series data with other types of data.
|
||||
|
||||
Inside InfluxDB, aggregate functions calculated on the fly may combine tens of thousands of distinct time series into a single view.
|
||||
Each one of those queries must read each aggregated data point, so for InfluxDB the read throughput is often many times higher than the write throughput.
|
||||
|
||||
Given that time series is mostly an append-only workload, you might think that it's possible to get great performance on a B+Tree.
|
||||
Appends in the keyspace are efficient and you can achieve greater than 100,000 per second.
|
||||
However, we have those appends happening in individual time series.
|
||||
So the inserts end up looking more like random inserts than append only inserts.
|
||||
|
||||
One of the biggest problems we found with time series data is that it's very common to delete all data after it gets past a certain age.
|
||||
The common pattern here is that users have high precision data that is kept for a short period of time like a few days or months.
|
||||
Users then downsample and aggregate that data into lower precision rollups that are kept around much longer.
|
||||
|
||||
The naive implementation would be to simply delete each record once it passes its expiration time.
|
||||
However, that means that once the first points written reach their expiration date, the system is processing just as many deletes as writes, which is something most storage engines aren't designed for.
|
||||
|
||||
Let's dig into the details of the two types of storage engines we tried and how these properties had a significant impact on our performance.
|
||||
|
||||
## LevelDB and log structured merge trees
|
||||
|
||||
When the InfluxDB project began, we picked LevelDB as the storage engine because we had used it for time series data storage in the product that was the precursor to InfluxDB.
|
||||
We knew that it had great properties for write throughput and everything seemed to "just work".
|
||||
|
||||
LevelDB is an implementation of a log structured merge tree (LSM tree) that was built as an open source project at Google.
|
||||
It exposes an API for a key-value store where the key space is sorted.
|
||||
This last part is important for time series data as it allowed us to quickly scan ranges of time as long as the timestamp was in the key.
|
||||
|
||||
LSM Trees are based on a log that takes writes and two structures known as Mem Tables and SSTables.
|
||||
These tables represent the sorted keyspace.
|
||||
SSTables are read only files that are continuously replaced by other SSTables that merge inserts and updates into the keyspace.
|
||||
|
||||
The two biggest advantages that LevelDB had for us were high write throughput and built in compression.
|
||||
However, as we learned more about what people needed with time series data, we encountered a few insurmountable challenges.
|
||||
|
||||
The first problem we had was that LevelDB doesn't support hot backups.
|
||||
If you want to do a safe backup of the database, you have to close it and then copy it.
|
||||
The LevelDB variants RocksDB and HyperLevelDB fix this problem, but there was another more pressing problem that we didn't think they could solve.
|
||||
|
||||
Our users needed a way to automatically manage data retention.
|
||||
That meant we needed deletes on a very large scale.
|
||||
In LSM Trees, a delete is as expensive, if not more so, than a write.
|
||||
A delete writes a new record known as a tombstone.
|
||||
After that queries merge the result set with any tombstones to purge the deleted data from the query return.
|
||||
Later, a compaction runs that removes the tombstone record and the underlying deleted record in the SSTable file.
|
||||
|
||||
To get around doing deletes, we split data across what we call shards, which are contiguous blocks of time.
|
||||
Shards would typically hold either one day or seven days worth of data.
|
||||
Each shard mapped to an underlying LevelDB.
|
||||
This meant that we could drop an entire day of data by just closing out the database and removing the underlying files.
|
||||
|
||||
Users of RocksDB may at this point bring up a feature called ColumnFamilies.
|
||||
When putting time series data into Rocks, it's common to split blocks of time into column families and then drop those when their time is up.
|
||||
It's the same general idea: create a separate area where you can just drop files instead of updating indexes when you delete a large block of data.
|
||||
Dropping a column family is a very efficient operation.
|
||||
However, column families are a fairly new feature and we had another use case for shards.
|
||||
|
||||
Organizing data into shards meant that it could be moved within a cluster without having to examine billions of keys.
|
||||
At the time of this writing, it was not possible to move a column family from one RocksDB database to another.
|
||||
Old shards are typically cold for writes so moving them around would be cheap and easy.
|
||||
We would have the added benefit of having a spot in the keyspace that is cold for writes so it would be easier to do consistency checks later.
|
||||
|
||||
The organization of data into shards worked great for a while, until a large amount of data went into InfluxDB.
|
||||
LevelDB splits the data out over many small files.
|
||||
Having dozens or hundreds of these databases open in a single process ended up creating a big problem.
|
||||
Users that had six months or a year of data would run out of file handles.
|
||||
It's not something we found with the majority of users, but anyone pushing the database to its limits would hit this problem and we had no fix for it.
|
||||
There were simply too many file handles open.
|
||||
|
||||
## BoltDB and mmap B+Trees
|
||||
|
||||
After struggling with LevelDB and its variants for a year we decided to move over to BoltDB, a pure Golang database heavily inspired by LMDB, a mmap B+Tree database written in C.
|
||||
It has the same API semantics as LevelDB: a key value store where the keyspace is ordered.
|
||||
Many of our users were surprised.
|
||||
Our own posted tests of the LevelDB variants vs. LMDB (a mmap B+Tree) showed RocksDB as the best performer.
|
||||
|
||||
However, there were other considerations that went into this decision outside of the pure write performance.
|
||||
At this point our most important goal was to get to something stable that could be run in production and backed up.
|
||||
BoltDB also had the advantage of being written in pure Go, which simplified our build chain immensely and made it easy to build for other OSes and platforms.
|
||||
|
||||
The biggest win for us was that BoltDB used a single file as the database.
|
||||
At this point our most common source of bug reports were from people running out of file handles.
|
||||
Bolt solved the hot backup problem and the file limit problems all at the same time.
|
||||
|
||||
We were willing to take a hit on write throughput if it meant that we'd have a system that was more reliable and stable that we could build on.
|
||||
Our reasoning was that for anyone pushing really big write loads, they'd be running a cluster anyway.
|
||||
|
||||
We released versions 0.9.0 to 0.9.2 based on BoltDB.
|
||||
From a development perspective it was delightful.
|
||||
Clean API, fast and easy to build in our Go project, and reliable.
|
||||
However, after running for a while we found a big problem with write throughput.
|
||||
After the database got over a few GB, writes would start spiking IOPS.
|
||||
|
||||
Some users were able to get past this by putting InfluxDB on big hardware with near unlimited IOPS.
|
||||
However, most users are on VMs with limited resources in the cloud.
|
||||
We had to figure out a way to reduce the impact of writing a bunch of points into hundreds of thousands of series at a time.
|
||||
|
||||
With the 0.9.3 and 0.9.4 releases our plan was to put a write ahead log (WAL) in front of Bolt.
|
||||
That way we could reduce the number of random insertions into the keyspace.
|
||||
Instead, we'd buffer up multiple writes that were next to each other and then flush them at once.
|
||||
However, that only served to delay the problem.
|
||||
High IOPS still became an issue, and it showed up very quickly for anyone operating at even moderate workloads.
|
||||
|
||||
However, our experience building the first WAL implementation in front of Bolt gave us the confidence we needed that the write problem could be solved.
|
||||
The performance of the WAL itself was fantastic, but the index simply could not keep up.
|
||||
At this point we started thinking again about how we could create something similar to an LSM Tree that could keep up with our write load.
|
||||
|
||||
Thus was born the Time Structured Merge Tree.
|
|
---
|
||||
title: Time Series Index (TSI) overview
|
||||
description: >
|
||||
The Time Series Index (TSI) storage engine supports high cardinality in time series data.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Time Series Index (TSI) overview
|
||||
weight: 70
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
Find overview and background information on Time Series Index (TSI) in this topic. For detail, including how to enable and configure TSI, see [Time Series Index (TSI) details](/enterprise_influxdb/v1.10/concepts/tsi-details/).
|
||||
|
||||
## Overview
|
||||
|
||||
To support a large number of time series, that is, a very high cardinality in the number of unique time series that the database stores, InfluxData has added the new Time Series Index (TSI).
|
||||
InfluxData supports customers using InfluxDB with tens of millions of time series.
|
||||
InfluxData's goal, however, is to expand to hundreds of millions, and eventually billions.
|
||||
Using InfluxData's TSI storage engine, users should be able to have millions of unique time series.
|
||||
The goal is that the number of series should be unbounded by the amount of memory on the server hardware.
|
||||
Importantly, the number of series that exist in the database will have a negligible impact on database startup time.
|
||||
This work represents the most significant technical advancement in the database since InfluxData released the Time Series Merge Tree (TSM) storage engine in 2016.
|
||||
|
||||
## Background information
|
||||
|
||||
InfluxDB actually looks like two databases in one, a time series data store and an inverted index for the measurement, tag, and field metadata.
|
||||
|
||||
### Time-Structured Merge Tree (TSM)
|
||||
|
||||
The Time-Structured Merge Tree (TSM) engine solves the problem of getting maximum throughput, compression, and query speed for raw time series data.
|
||||
Up until TSI, the inverted index was an in-memory data structure that was built during startup of the database based on the data in TSM.
|
||||
This meant that for every measurement, tag key-value pair, and field name, there was a lookup table in-memory to map those bits of metadata to an underlying time series.
|
||||
For users with a high number of ephemeral series, memory utilization continued increasing as new time series were created.
|
||||
And, startup times increased since all of that data would have to be loaded onto the heap at start time.
|
||||
|
||||
> For details, see [TSM-based data storage and in-memory indexing](/enterprise_influxdb/v1.10/concepts/storage_engine/).
|
||||
|
||||
### Time Series Index (TSI)
|
||||
|
||||
The new time series index (TSI) moves the index to files on disk that we memory map.
|
||||
This means that we let the operating system handle being the Least Recently Used (LRU) memory.
|
||||
Much like the TSM engine for raw time series data we have a write-ahead log with an in-memory structure that gets merged at query time with the memory-mapped index.
|
||||
Background routines run constantly to compact the index into larger and larger files to avoid having to do too many index merges at query time.
|
||||
Under the covers, we’re using techniques like Robin Hood Hashing to do fast index lookups and HyperLogLog++ to keep sketches of cardinality estimates.
|
||||
The latter will give us the ability to add things to the query languages like the [SHOW CARDINALITY](/enterprise_influxdb/v1.10/query_language/spec#show-cardinality) queries.
|
||||
|
||||
### Issues solved by TSI and remaining to be solved
|
||||
|
||||
The primary issue that Time Series Index (TSI) addresses is ephemeral time series. Most frequently, this occurs in use cases that want to track per process metrics or per container metrics by putting identifiers in tags. For example, the [Heapster project for Kubernetes](https://github.com/kubernetes/heapster) does this. For series that are no longer hot for writes or queries, they won’t take up space in memory.
|
||||
|
||||
The issue that TSI does not yet address is limiting the scope of data returned by the SHOW queries. We’ll have updates to the query language in the future to limit those results by time. We also don’t solve the problem of having all these series hot for reads and writes. For that problem, scale-out clustering is the solution. We’ll have to continue to optimize the query language and engine to work with large sets of series. We’ll need to add guard rails and limits into the language and eventually, add spill-to-disk query processing. That work will be on-going in every release of InfluxDB.
|
|
---
|
||||
title: Time Series Index (TSI) details
|
||||
description: Enable and understand the Time Series Index (TSI).
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Time Series Index (TSI) details
|
||||
weight: 80
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
When InfluxDB ingests data, we store not only the value but we also index the measurement and tag information so that it can be queried quickly.
|
||||
In earlier versions, index data could only be stored in memory; however, that requires a lot of RAM and places an upper bound on the number of series a machine can hold.
|
||||
This upper bound is usually somewhere between 1 and 4 million series, depending on the machine used.
|
||||
|
||||
The Time Series Index (TSI) was developed to allow us to go past that upper bound.
|
||||
TSI stores index data on disk so that we are no longer restricted by RAM.
|
||||
TSI uses the operating system's page cache to pull hot data into memory and let cold data rest on disk.
|
||||
|
||||
## Enable TSI
|
||||
|
||||
To enable TSI, set the following line in the `[data]` section of the InfluxDB configuration file (`influxdb.conf`):
|
||||
|
||||
```
|
||||
index-version = "tsi1"
|
||||
```
|
||||
|
||||
(Be sure to include the double quotes.)
|
||||
|
||||
### InfluxDB Enterprise
|
||||
|
||||
- To convert your data nodes to support TSI, see [Upgrade InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/upgrading/).
|
||||
|
||||
- For detail on configuration, see [Configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.10/administration/configuration/).
|
||||
|
||||
## Tooling
|
||||
|
||||
### `influx_inspect dumptsi`
|
||||
|
||||
If you are troubleshooting an issue with an index, you can use the `influx_inspect dumptsi` command.
|
||||
This command allows you to print summary statistics on an index, file, or a set of files.
|
||||
This command only works on one index at a time.
|
||||
|
||||
For details on this command, see [influx_inspect dumptsi](/enterprise_influxdb/v1.10/tools/influx_inspect/#dumptsi).
|
||||
|
||||
### `influx_inspect buildtsi`
|
||||
|
||||
If you want to convert an existing shard from an in-memory index to a TSI index, or if you have an existing TSI index which has become corrupt, you can use the `buildtsi` command to create the index from the underlying TSM data.
|
||||
If you have an existing TSI index that you want to rebuild, first delete the `index` directory within your shard.
|
||||
|
||||
This command works at the server level, but you can optionally add database, retention policy, and shard filters to apply it only to a subset of shards.
|
||||
|
||||
For details on this command, see [influx inspect buildtsi](/enterprise_influxdb/v1.10/tools/influx_inspect/#buildtsi).
|
||||
|
||||
|
||||
## Understanding TSI
|
||||
|
||||
### File organization
|
||||
|
||||
TSI (Time Series Index) is a log-structured merge tree-based database for InfluxDB series data.
|
||||
TSI is composed of several parts:
|
||||
|
||||
* **Index**: Contains the entire index dataset for a single shard.
|
||||
|
||||
* **Partition**: Contains a sharded partition of the data for a shard.
|
||||
|
||||
* **LogFile**: Contains newly written series as an in-memory index and is persisted as a WAL.
|
||||
|
||||
* **IndexFile**: Contains an immutable, memory-mapped index built from a LogFile or merged from two contiguous index files.
|
||||
|
||||
There is also a **SeriesFile** which contains a set of all series keys across the entire database.
|
||||
Each shard within the database shares the same series file.
|
||||
|
||||
### Writes
|
||||
|
||||
The following occurs when a write comes into the system:
|
||||
|
||||
1. Series is added to the series file or is looked up if it already exists. This returns an auto-incrementing series ID.
|
||||
2. The series is sent to the Index. The index maintains a roaring bitmap of existing series IDs and ignores series that have already been created.
|
||||
3. The series is hashed and sent to the appropriate Partition.
|
||||
4. The Partition writes the series as an entry to the LogFile.
|
||||
5. The LogFile writes the series to a write-ahead log file on disk and adds the series to a set of in-memory indexes.
|
||||
|
||||
### Compaction
|
||||
|
||||
Once the LogFile exceeds a threshold (5MB), a new active log file is created and the previous one begins compacting into an IndexFile.
|
||||
This first index file is at level 1 (L1).
|
||||
The log file is considered level 0 (L0).
|
||||
|
||||
Index files can also be created by merging two smaller index files together.
|
||||
For example, if two contiguous L1 index files exist, they can be merged into an L2 index file.
|
||||
|
||||
### Reads
|
||||
|
||||
The index provides several API calls for retrieving sets of data such as:
|
||||
|
||||
* `MeasurementIterator()`: Returns a sorted list of measurement names.
|
||||
* `TagKeyIterator()`: Returns a sorted list of tag keys in a measurement.
|
||||
* `TagValueIterator()`: Returns a sorted list of tag values for a tag key.
|
||||
* `MeasurementSeriesIDIterator()`: Returns a sorted list of all series IDs for a measurement.
|
||||
* `TagKeySeriesIDIterator()`: Returns a sorted list of all series IDs for a tag key.
|
||||
* `TagValueSeriesIDIterator()`: Returns a sorted list of all series IDs for a tag value.
|
||||
|
||||
These iterators are all composable using several merge iterators.
|
||||
For each type of iterator (measurement, tag key, tag value, series id), there are multiple merge iterator types:
|
||||
|
||||
* **Merge**: Deduplicates items from two iterators.
|
||||
* **Intersect**: Returns only items that exist in two iterators.
|
||||
* **Difference**: Returns only items from the first iterator that don't exist in the second iterator.
|
||||
|
||||
For example, a query with a WHERE clause of `region != 'us-west'` that operates across two shards will construct a set of iterators like this:
|
||||
|
||||
```
|
||||
DifferenceSeriesIDIterators(
|
||||
MergeSeriesIDIterators(
|
||||
Shard1.MeasurementSeriesIDIterator("m"),
|
||||
Shard2.MeasurementSeriesIDIterator("m"),
|
||||
),
|
||||
MergeSeriesIDIterators(
|
||||
Shard1.TagValueSeriesIDIterator("m", "region", "us-west"),
|
||||
Shard2.TagValueSeriesIDIterator("m", "region", "us-west"),
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
### Log File Structure
|
||||
|
||||
The log file is simply structured as a list of LogEntry objects written to disk in sequential order. Log files are written until they reach 5MB and then they are compacted into index files.
|
||||
The entry objects in the log can be of any of the following types:
|
||||
|
||||
* AddSeries
|
||||
* DeleteSeries
|
||||
* DeleteMeasurement
|
||||
* DeleteTagKey
|
||||
* DeleteTagValue
|
||||
|
||||
The in-memory index on the log file tracks the following:
|
||||
|
||||
* Measurements by name
|
||||
* Tag keys by measurement
|
||||
* Tag values by tag key
|
||||
* Series by measurement
|
||||
* Series by tag value
|
||||
* Tombstones for series, measurements, tag keys, and tag values.
|
||||
|
||||
The log file also maintains bitsets for series ID existence and tombstones.
|
||||
These bitsets are merged with other log files and index files to regenerate the full index bitset on startup.
|
||||
|
||||
### Index File Structure
|
||||
|
||||
The index file is an immutable file that tracks similar information to the log file, but all data is indexed and written to disk so that it can be directly accessed from a memory-map.
|
||||
|
||||
The index file has the following sections:
|
||||
|
||||
* **TagBlocks:** Maintains an index of tag values for a single tag key.
|
||||
* **MeasurementBlock:** Maintains an index of measurements and their tag keys.
|
||||
* **Trailer:** Stores offset information for the file as well as HyperLogLog sketches for cardinality estimation.
|
||||
|
||||
### Manifest
|
||||
|
||||
The MANIFEST file is stored in the index directory and lists all the files that belong to the index and the order in which they should be accessed.
|
||||
This file is updated every time a compaction occurs.
|
||||
Any files in the directory that are not listed in the MANIFEST are index files that are in the process of being compacted.
|
||||
|
||||
### FileSet
|
||||
|
||||
A file set is an in-memory snapshot of the manifest that is obtained while the InfluxDB process is running.
|
||||
This is required to provide a consistent view of the index at a point-in-time.
|
||||
The file set also facilitates reference counting for all of its files so that no file will be deleted via compaction until all readers of the file are done with it.
|
|
---
|
||||
title: InfluxDB Enterprise features
|
||||
description: Users, clustering, and other InfluxDB Enterprise features.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Enterprise features
|
||||
weight: 60
|
||||
---
|
||||
|
||||
InfluxDB Enterprise has additional capabilities that enhance
|
||||
[availability](#clustering),
|
||||
[scalability](#clustering), and
|
||||
[security](#security),
|
||||
and provide [eventual consistency](#eventual-consistency).
|
||||
|
||||
## Clustering
|
||||
|
||||
InfluxDB Enterprise runs on a network of independent servers, a *cluster*,
|
||||
to provide fault tolerance, availability, and horizontal scalability of the database.
|
||||
|
||||
While many InfluxDB Enterprise features are available
|
||||
when run with a single meta node and a single data node, this configuration does not take advantage of the clustering capability
|
||||
or ensure high availability.
|
||||
|
||||
Nodes can be added to an existing cluster to improve database performance for querying and writing data.
|
||||
Certain configurations (e.g., 3 meta nodes and 2 data nodes) provide high-availability assurances
|
||||
while making certain tradeoffs in query performance when compared to a single node.
|
||||
|
||||
Further increasing the number of nodes can improve performance in both respects.
|
||||
For example, a cluster with 4 data nodes and a [replication factor](https://docs.influxdata.com/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
|
||||
of 2 can support a higher volume of write traffic than a single node could.
|
||||
It can also support a higher *query* workload, as the data is replicated
|
||||
in two locations. Performance of the queries may be on par with a single
|
||||
node in cases where the query can be answered directly by the node which
|
||||
receives the query.
|
||||
|
||||
For more information on clustering, see [Clustering in InfluxDB Enterprise](/enterprise_influxdb/v1.10/concepts/clustering/).
|
||||
|
||||
## Security
|
||||
|
||||
Enterprise authorization uses an expanded set of [*16 user permissions and roles*](/enterprise_influxdb/v1.10/features/users/).
|
||||
(InfluxDB OSS only has `READ` and `WRITE` permissions.)
|
||||
Administrators can give users permission to read and write to databases,
|
||||
create and remove databases, rebalance a cluster, and manage particular resources.
|
||||
|
||||
Organizations can automate managing permissions with the [InfluxDB Enterprise Meta API](/enterprise_influxdb/v1.10/administration/manage/security/authentication_and_authorization-api/).
|
||||
|
||||
[Fine-grained authorization](/enterprise_influxdb/v1.10/guides/fine-grained-authorization/)
|
||||
for particular data is also available.
|
||||
|
||||
InfluxDB Enterprise can also use [LDAP for managing authentication](/enterprise_influxdb/v1.10/administration/manage/security/ldap/).
|
||||
|
||||
For FIPS compliance, InfluxDB Enterprise password hashing algorithms are configurable.
|
||||
|
||||
{{% note %}}
|
||||
Kapacitor OSS can also delegate its LDAP and security setup to InfluxDB Enterprise.
|
||||
For details, see ["Set up InfluxDB Enterprise authorizations"](/{{< latest "kapacitor" >}}/administration/auth/influxdb-enterprise-auth/).
|
||||
{{% /note %}}
|
||||
|
||||
## Eventual consistency
|
||||
|
||||
### Hinted handoff
|
||||
|
||||
Hinted handoff (HH) is how InfluxDB Enterprise deals with data node outages while writes are happening.
|
||||
HH is essentially a durable, disk-based queue.
|
||||
|
||||
For more information, see ["Hinted handoff"](/enterprise_influxdb/v1.10/concepts/clustering/#hinted-handoff).
|
||||
|
||||
### Anti-entropy
|
||||
|
||||
Anti-entropy is an optional service to eliminate edge cases related to cluster consistency.
|
||||
|
||||
For more information, see ["Use Anti-Entropy service in InfluxDB Enterprise"](/enterprise_influxdb/v1.10/administration/anti-entropy/).
|
||||
|
||||
---
|
||||
|
||||
{{< children hlevel="h3" >}}
|
|
---
|
||||
title: InfluxDB Enterprise cluster features
|
||||
description: Overview of features related to InfluxDB Enterprise clustering.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/clustering-features/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Cluster features
|
||||
weight: 20
|
||||
parent: Enterprise features
|
||||
---
|
||||
|
||||
{{% note %}}
|
||||
For an overview of InfluxDB Enterprise security features,
|
||||
see ["InfluxDB Enterprise features - Security"](/enterprise_influxdb/v1.10/features/#security).
|
||||
To secure your InfluxDB Enterprise cluster, see
|
||||
["Configure security"](/enterprise_influxdb/v1.10/administration/configure/security/).
|
||||
{{% /note %}}
|
||||
|
||||
## Entitlements
|
||||
|
||||
A valid license key is required in order to start `influxd-meta` or `influxd`.
|
||||
License keys restrict the number of data nodes that can be added to a cluster as well as the number of CPU cores a data node can use.
|
||||
Without a valid license, the process will abort startup.
|
||||
|
||||
Access your license expiration date with the `/debug/vars` endpoint.
|
||||
|
||||
{{< keep-url >}}
|
||||
```sh
|
||||
$ curl http://localhost:8086/debug/vars | jq '.entitlements'
|
||||
{
|
||||
"name": "entitlements",
|
||||
"tags": null,
|
||||
"values": {
|
||||
"licenseExpiry": "2022-02-15T00:00:00Z",
|
||||
"licenseType": "license-key"
|
||||
}
|
||||
}
|
||||
```
|
||||
{{% caption %}}
|
||||
This example uses `curl` and [`jq`](https://stedolan.github.io/jq/).
|
||||
{{% /caption %}}
|
||||
|
||||
## Query management
|
||||
|
||||
Query management works cluster wide. Specifically, `SHOW QUERIES` and `KILL QUERY <ID>` on `"<host>"` can be run on any data node. `SHOW QUERIES` will report all queries running across the cluster and the node which is running the query.
|
||||
`KILL QUERY` can abort queries running on the local node or any other remote data node. For details on using the `SHOW QUERIES` and `KILL QUERY` on InfluxDB Enterprise clusters,
|
||||
see [Query Management](/enterprise_influxdb/v1.10/troubleshooting/query_management/).
|
||||
|
||||
## Subscriptions
|
||||
|
||||
Subscriptions used by Kapacitor work in a cluster. Writes to any node will be forwarded to subscribers across all supported subscription protocols.
|
||||
|
||||
## Continuous queries
|
||||
|
||||
### Configuration and operational considerations on a cluster
|
||||
|
||||
It is important to understand how to configure InfluxDB Enterprise and how this impacts the continuous queries (CQ) engine’s behavior:
|
||||
|
||||
- **Data node configuration** `[continuous queries]`
|
||||
[run-interval](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#run-interval)
|
||||
-- The interval at which InfluxDB checks to see if a CQ needs to run. Set this option to the lowest interval
|
||||
at which your CQs run. For example, if your most frequent CQ runs every minute, set run-interval to 1m.
|
||||
- **Meta node configuration** `[meta]`
|
||||
[lease-duration](/enterprise_influxdb/v1.10/administration/configure/config-meta-nodes/#lease-duration)
|
||||
-- The default duration of the leases that data nodes acquire from the meta nodes. Leases automatically expire after the
|
||||
lease-duration is met. Leases ensure that only one data node is running something at a given time. For example, Continuous
|
||||
Queries use a lease so that all data nodes aren’t running the same CQs at once.
|
||||
- **Execution time of CQs** – CQs are sequentially executed. Depending on the amount of work that they need to accomplish
|
||||
in order to complete, the configuration parameters mentioned above can have an impact on the observed behavior of CQs.
|
||||
|
||||
The CQ service is running on every node, but only a single node is granted exclusive access to execute CQs at any one time.
|
||||
However, every time the `run-interval` elapses (and assuming a node isn't currently executing CQs), a node attempts to
|
||||
acquire the CQ lease. By default the `run-interval` is one second – so the data nodes are aggressively checking to see
|
||||
if they can acquire the lease. On clusters where all CQs execute in an amount of time less than `lease-duration`
|
||||
(default is 1m), there's a good chance that the first data node to acquire the lease will still hold the lease when
|
||||
the `run-interval` elapses. Other nodes will be denied the lease and when the node holding the lease requests it again,
|
||||
the lease is renewed with the expiration extended to `lease-duration`. So in a typical situation, we observe that a
|
||||
single data node acquires the CQ lease and holds on to it. It effectively becomes the executor of CQs until it is
|
||||
recycled (for any reason).
|
||||
|
||||
Now consider the following case: CQs take longer to execute than the `lease-duration`, so when the lease expires,
|
||||
~1 second later another data node requests and is granted the lease. The original holder of the lease is busily working
|
||||
on sequentially executing the list of CQs it was originally handed and the data node now holding the lease begins
|
||||
executing CQs from the top of the list.
|
||||
|
||||
Based on this scenario, it may appear that CQs are “executing in parallel” because multiple data nodes are
|
||||
essentially “rolling” sequentially through the registered CQs and the lease is rolling from node to node.
|
||||
The “long pole” here is effectively your most complex CQ – and it likely means that at some point all nodes
|
||||
are attempting to execute that same complex CQ (and likely competing for resources as they overwrite points
|
||||
generated by that query on each node that is executing it --- likely with some phased offset).
|
||||
|
||||
To avoid this behavior (and this is desirable because it reduces the overall load on your cluster),
|
||||
you should set the lease-duration to a value greater than the aggregate execution time for ALL the CQs that you are running.
|
||||
|
||||
Based on the current way in which CQs are configured to execute, the way to address parallelism is by using
|
||||
Kapacitor for the more complex CQs that you are attempting to run.
|
||||
[See Kapacitor as a continuous query engine](/{{< latest "kapacitor" >}}/guides/continuous_queries/).
|
||||
However, you can keep the more simplistic and highly performant CQs within the database –
|
||||
but ensure that the lease duration is greater than their aggregate execution time to ensure that
|
||||
“extra” load is not being unnecessarily introduced on your cluster.
|
||||
|
||||
|
||||
## PProf endpoints
|
||||
|
||||
Meta nodes expose the `/debug/pprof` endpoints for profiling and troubleshooting.
|
||||
|
||||
## Shard movement
|
||||
|
||||
* [Copy shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#copy-shard) support - copy a shard from one node to another
|
||||
* [Copy shard status](/enterprise_influxdb/v1.10/tools/influxd-ctl/#copy-shard-status) - query the status of a copy shard request
|
||||
* [Kill copy shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#kill-copy-shard) - kill a running shard copy
|
||||
* [Remove shard](/enterprise_influxdb/v1.10/tools/influxd-ctl/#remove-shard) - remove a shard from a node (this deletes data)
|
||||
* [Truncate shards](/enterprise_influxdb/v1.10/tools/influxd-ctl/#truncate-shards) - truncate all active shard groups and start new shards immediately (This is useful when adding nodes or changing replication factors.)
|
||||
|
||||
This functionality is exposed via an API on the meta service and through [`influxd-ctl` sub-commands](/enterprise_influxdb/v1.10/tools/influxd-ctl/).
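
For example, a shard copy might look like the following sketch (the data node addresses and shard ID are placeholders):

```bash
# Copy shard 22 from one data node to another, then check on the copy's progress.
influxd-ctl copy-shard data-node-01:8088 data-node-02:8088 22
influxd-ctl copy-shard-status
```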
|
||||
|
||||
## OSS conversion
|
||||
|
||||
Importing an OSS single server as the first data node is supported.
|
||||
|
||||
See [OSS to cluster migration](/enterprise_influxdb/v1.10/guides/migration/) for
|
||||
step-by-step instructions.
|
||||
|
||||
## Query routing
|
||||
|
||||
The query engine skips failed nodes that hold a shard needed for queries.
|
||||
If there is a replica on another node, it will retry on that node.
|
||||
|
||||
## Backup and restore
|
||||
|
||||
InfluxDB Enterprise clusters support backup and restore functionality starting with
|
||||
version 0.7.1.
|
||||
See [Backup and restore](/enterprise_influxdb/v1.10/administration/backup-and-restore/) for
|
||||
more information.
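
For example, a full cluster backup to a local directory looks roughly like this (the target directory is a placeholder):

```bash
# Back up cluster metadata and shard data to ./enterprise-backup.
influxd-ctl backup ./enterprise-backup
```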
|
||||
|
||||
## Passive node setup (experimental)
|
||||
|
||||
Passive nodes act as load balancers--they accept write calls, perform shard lookup and RPC calls (on active data nodes), and distribute writes to active data nodes. They do not own shards or accept queries.
|
||||
|
||||
Use this feature when you have a replication factor (RF) of 2 or more and your CPU usage is consistently above 80 percent. Using the passive feature lets you scale a cluster when you can no longer vertically scale. Especially useful if you experience a large amount of hinted handoff growth. The passive node writes the hinted handoff queue to its own disk, and then communicates periodically with the appropriate node until it can send the queue contents there.
|
||||
|
||||
Best practices when using an active-passive node setup:
|
||||
- Use when you have a large cluster setup, generally 8 or more nodes.
|
||||
- Keep the ratio of active to passive nodes between 1:1 and 2:1.
|
||||
- Passive nodes should receive all writes.
|
||||
|
||||
For more information, see how to [add a passive node to a cluster](/enterprise_influxdb/v1.10/tools/influxd-ctl/#add-a-passive-node-to-the-cluster).
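
As a sketch, adding a passive node might look like the following, assuming the `-p` flag of `influxd-ctl add-data` marks the node as passive (see the linked page for the authoritative syntax; the hostname and port are placeholders):

```bash
# Add a data node to the cluster in passive mode (hostname and port are placeholders).
influxd-ctl add-data -p enterprise-data-04:8088
```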
|
||||
|
||||
{{% note %}}
|
||||
**Note:** This feature is experimental and available only in InfluxDB Enterprise.
|
||||
{{% /note %}}
|
|
@ -0,0 +1,35 @@
|
|||
---
|
||||
title: Flux data scripting language
|
||||
description: >
|
||||
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Flux
|
||||
weight: 71
|
||||
v2: /influxdb/v2.0/query-data/get-started/
|
||||
---
|
||||
|
||||
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
It takes the power of [InfluxQL](/enterprise_influxdb/v1.10/query_language/spec/) and the functionality of [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/) and combines them into a single, unified syntax.
|
||||
|
||||
> Flux v0.65 is production-ready and included with [InfluxDB v1.8](/enterprise_influxdb/v1.10).
|
||||
> The InfluxDB v1.8 implementation of Flux is read-only and does not support
|
||||
> writing data back to InfluxDB.
|
||||
|
||||
## Flux design principles
|
||||
Flux is designed to be usable, readable, flexible, composable, testable, contributable, and shareable.
|
||||
Its syntax is largely inspired by [2018's most popular scripting language](https://insights.stackoverflow.com/survey/2018#technology),
|
||||
JavaScript, and takes a functional approach to data exploration and processing.
|
||||
|
||||
The following example illustrates pulling data from a bucket (similar to an InfluxQL database) for the last hour,
|
||||
filtering that data by the `cpu` measurement and the `cpu=cpu-total` tag, windowing the data in 1 minute intervals,
|
||||
and calculating the average of each window:
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|
||||
|> aggregateWindow(every: 1m, fn: mean)
|
||||
```
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,157 @@
|
|||
---
|
||||
title: Execute Flux queries
|
||||
description: Use the InfluxDB CLI, API, and the Chronograf Data Explorer to execute Flux queries.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Execute Flux queries
|
||||
parent: Flux
|
||||
weight: 1
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/guides/executing-queries/
|
||||
- /enterprise_influxdb/v1.10/flux/guides/execute-queries/
|
||||
v2: /influxdb/v2.0/query-data/execute-queries/
|
||||
---
|
||||
|
||||
There are multiple ways to execute Flux queries with InfluxDB Enterprise and Chronograf v1.8+.
|
||||
This guide covers the different options:
|
||||
|
||||
1. [Chronograf's Data Explorer](#chronograf-s-data-explorer)
|
||||
2. [Influx CLI](#influx-cli)
|
||||
3. [InfluxDB API](#influxdb-api)
|
||||
|
||||
> Before attempting these methods, make sure Flux is enabled by setting
|
||||
> `flux-enabled = true` in the `[http]` section of your InfluxDB configuration file.
|
||||
|
||||
## Chronograf's Data Explorer
|
||||
Chronograf v1.8+ supports Flux in its Data Explorer.
|
||||
Flux queries can be built, executed, and visualized from within the Chronograf user interface.
|
||||
|
||||
## Influx CLI
|
||||
To start an interactive Flux read-eval-print-loop (REPL) with the InfluxDB Enterprise 1.10+
|
||||
`influx` CLI, run the `influx` command with the following flags:
|
||||
|
||||
- `-type=flux`
|
||||
- `-path-prefix=/api/v2/query`
|
||||
|
||||
{{% note %}}
|
||||
If [authentication is enabled](/enterprise_influxdb/v1.10/administration/authentication_and_authorization)
|
||||
on your InfluxDB instance, use the `-username` flag to provide your InfluxDB username and
|
||||
the `-password` flag to provide your password.
|
||||
{{% /note %}}
|
||||
|
||||
##### Enter an interactive Flux REPL
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
influx -type=flux -path-prefix=/api/v2/query
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
influx -type=flux \
|
||||
-path-prefix=/api/v2/query \
|
||||
-username myuser \
|
||||
-password PasSw0rd
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
Any Flux query can be executed within the REPL.
|
||||
|
||||
### Submit a Flux query via parameter
|
||||
Flux queries can also be passed to the `influx` CLI as a parameter using the `-type=flux` option and the `-execute` flag.
|
||||
The accompanying string is executed as a Flux query and results are output in your terminal.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
influx -type=flux \
|
||||
-path-prefix=/api/v2/query \
|
||||
-execute '<flux query>'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
influx -type=flux \
|
||||
-path-prefix=/api/v2/query \
|
||||
-username myuser \
|
||||
-password PasSw0rd \
|
||||
-execute '<flux query>'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
### Submit a Flux query via STDIN
|
||||
Flux queries can be piped into the `influx` CLI via STDIN.
|
||||
Query results are output in your terminal.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
echo '<flux query>' | influx -type=flux -path-prefix=/api/v2/query
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
echo '<flux query>' | influx -type=flux \
|
||||
-path-prefix=/api/v2/query \
|
||||
-username myuser \
|
||||
-password PasSw0rd
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
## InfluxDB API
|
||||
Flux can be used to query InfluxDB through InfluxDB's `/api/v2/query` endpoint.
|
||||
Queried data is returned in annotated CSV format.
|
||||
|
||||
In your request, set the following:
|
||||
|
||||
- `Accept` header to `application/csv`
|
||||
- `Content-type` header to `application/vnd.flux`
|
||||
- If [authentication is enabled](/enterprise_influxdb/v1.10/administration/authentication_and_authorization)
|
||||
on your InfluxDB instance, `Authorization` header to `Token <username>:<password>`
|
||||
|
||||
This allows you to POST the Flux query in plain text and receive the annotated CSV response.
|
||||
|
||||
Below is an example `curl` command that queries InfluxDB using Flux:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-H 'Authorization: Token <username>:<password>' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
|
@ -0,0 +1,368 @@
|
|||
---
|
||||
title: Flux vs InfluxQL
|
||||
description:
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Flux vs InfluxQL
|
||||
parent: Flux
|
||||
weight: 5
|
||||
---
|
||||
|
||||
Flux is an alternative to [InfluxQL](/enterprise_influxdb/v1.10/query_language/) and other SQL-like query languages for querying and analyzing data.
|
||||
Flux uses functional language patterns making it incredibly powerful, flexible, and able to overcome many of the limitations of InfluxQL.
|
||||
This article outlines many of the tasks possible with Flux but not InfluxQL and provides information about Flux and InfluxQL parity.
|
||||
|
||||
- [Possible with Flux](#possible-with-flux)
|
||||
- [InfluxQL and Flux parity](#influxql-and-flux-parity)
|
||||
|
||||
## Possible with Flux
|
||||
|
||||
- [Joins](#joins)
|
||||
- [Math across measurements](#math-across-measurements)
|
||||
- [Sort by tags](#sort-by-tags)
|
||||
- [Group by any column](#group-by-any-column)
|
||||
- [Window by calendar months and years](#window-by-calendar-months-and-years)
|
||||
- [Work with multiple data sources](#work-with-multiple-data-sources)
|
||||
- [DatePart-like queries](#datepart-like-queries)
|
||||
- [Pivot](#pivot)
|
||||
- [Histograms](#histograms)
|
||||
- [Covariance](#covariance)
|
||||
- [Cast booleans to integers](#cast-booleans-to-integers)
|
||||
- [String manipulation and data shaping](#string-manipulation-and-data-shaping)
|
||||
- [Work with geo-temporal data](#work-with-geo-temporal-data)
|
||||
|
||||
### Joins
|
||||
InfluxQL has never supported joins. They can be accomplished using [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/),
|
||||
but even TICKscript's join capabilities are limited.
|
||||
Flux's [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join/) lets you
|
||||
join data **from any bucket, any measurement, and on any columns** as long as
|
||||
each data set includes the columns on which they are to be joined.
|
||||
This opens the door for really powerful and useful operations.
|
||||
|
||||
```js
|
||||
dataStream1 = from(bucket: "bucket1")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "network" and r._field == "bytes-transferred")
|
||||
|
||||
dataStream2 = from(bucket: "bucket1")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "httpd" and r._field == "requests-per-sec")
|
||||
|
||||
join(tables: {d1: dataStream1, d2: dataStream2}, on: ["_time", "_stop", "_start", "host"])
|
||||
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
_For an in-depth walkthrough of using the `join()` function, see [How to join data with Flux](/enterprise_influxdb/v1.10/flux/guides/join)._
|
||||
|
||||
---
|
||||
|
||||
### Math across measurements
|
||||
Being able to perform cross-measurement joins also allows you to run calculations using
|
||||
data from separate measurements – a highly requested feature from the InfluxData community.
|
||||
The example below takes two data streams from separate measurements, `mem` and `processes`,
|
||||
joins them, then calculates the average amount of memory used per running process:
|
||||
|
||||
```js
|
||||
// Memory used (in bytes)
|
||||
memUsed = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used")
|
||||
|
||||
// Total processes running
|
||||
procTotal = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "processes" and r._field == "total")
|
||||
|
||||
// Join memory used with total processes and calculate
|
||||
// the average memory (in MB) used for running processes.
|
||||
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
|
||||
|> map(fn: (r) => ({_time: r._time, _value: r._value_mem / r._value_proc / 1000000}))
|
||||
```
|
||||
|
||||
### Sort by tags
|
||||
InfluxQL's sorting capabilities are very limited, allowing you only to control the
|
||||
sort order of `time` using the `ORDER BY time` clause.
|
||||
Flux's [`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort) sorts records based on a list of columns.
|
||||
Depending on the column type, records are sorted lexicographically, numerically, or chronologically.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -12h)
|
||||
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|
||||
|> sort(columns: ["region", "host", "_value"])
|
||||
```
|
||||
|
||||
### Group by any column
|
||||
InfluxQL lets you group by tags or by time intervals, but nothing else.
|
||||
Flux lets you group by any column in the dataset, including `_value`.
|
||||
Use the Flux [`group()` function](/{{< latest "flux" >}}/stdlib/universe/group/)
|
||||
to define which columns to group data by.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-12h)
|
||||
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime" )
|
||||
|> group(columns:["host", "_value"])
|
||||
```
|
||||
|
||||
### Window by calendar months and years
|
||||
InfluxQL does not support windowing data by calendar months and years due to their varied lengths.
|
||||
Flux supports calendar month and year duration units (`1mo`, `1y`) and lets you
|
||||
window and aggregate data by calendar month and year.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-1y)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
|> aggregateWindow(every: 1mo, fn: mean)
|
||||
```
|
||||
|
||||
### Work with multiple data sources
|
||||
InfluxQL can only query data stored in InfluxDB.
|
||||
Flux can query data from other data sources such as CSV, PostgreSQL, MySQL, Google BigTable, and more.
|
||||
Join that data with data in InfluxDB to enrich query results.
|
||||
|
||||
- [Flux CSV package](/{{< latest "flux" >}}/stdlib/csv/)
|
||||
- [Flux SQL package](/{{< latest "flux" >}}/stdlib/sql/)
|
||||
- [Flux BigTable package](/{{< latest "flux" >}}/stdlib/experimental/bigtable/)
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
import "csv"
|
||||
import "sql"
|
||||
|
||||
csvData = csv.from(csv: rawCSV)
|
||||
sqlData = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query: "SELECT * FROM example_table",
|
||||
)
|
||||
data = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -24h)
|
||||
|> filter(fn: (r) => r._measurement == "sensor")
|
||||
|
||||
auxData = join(tables: {csv: csvData, sql: sqlData}, on: ["sensor_id"])
|
||||
enrichedData = join(tables: {data: data, aux: auxData}, on: ["sensor_id"])
|
||||
|
||||
enrichedData
|
||||
|> yield(name: "enriched_data")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_For an in-depth walkthrough of querying SQL data, see [Query SQL data sources](/enterprise_influxdb/v1.10/flux/guides/sql)._
|
||||
|
||||
---
|
||||
|
||||
### DatePart-like queries
|
||||
InfluxQL doesn't support DatePart-like queries that only return results during specified hours of the day.
|
||||
The Flux [`hourSelection` function](/{{< latest "flux" >}}/stdlib/universe/hourselection/)
|
||||
returns only data with time values in a specified hour range.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|
||||
|> hourSelection(start: 9, stop: 17)
|
||||
```
|
||||
|
||||
### Pivot
|
||||
Pivoting data tables has never been supported in InfluxQL.
|
||||
The Flux [`pivot()` function](/{{< latest "flux" >}}/stdlib/universe/pivot) provides the ability
|
||||
to pivot data tables by specifying `rowKey`, `columnKey`, and `valueColumn` parameters.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r.cpu == "cpu-total")
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
```
|
||||
|
||||
### Histograms
|
||||
The ability to generate histograms has been a highly requested feature for InfluxQL, but has never been supported.
|
||||
Flux's [`histogram()` function](/{{< latest "flux" >}}/stdlib/universe/histogram) uses input
|
||||
data to generate a cumulative histogram with support for other histogram types coming in the future.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> histogram(buckets: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,])
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_For an example of using Flux to create a cumulative histogram, see [Create histograms](/enterprise_influxdb/v1.10/flux/guides/histograms)._
|
||||
|
||||
---
|
||||
|
||||
### Covariance
|
||||
Flux provides functions for simple covariance calculation.
|
||||
The [`covariance()` function](/{{< latest "flux" >}}/stdlib/universe/covariance)
|
||||
calculates the covariance between two columns and the [`cov()` function](/{{< latest "flux" >}}/stdlib/universe/cov)
|
||||
calculates the covariance between two data streams.
|
||||
|
||||
###### Covariance between two columns
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> covariance(columns: ["x", "y"])
|
||||
```
|
||||
|
||||
###### Covariance between two streams of data
|
||||
```js
|
||||
table1 = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_1")
|
||||
|
||||
table2 = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_2")
|
||||
|
||||
cov(x: table1, y: table2, on: ["_time", "_field"])
|
||||
```
|
||||
|
||||
### Cast booleans to integers
|
||||
InfluxQL supports type casting, but only for numeric data types (floats to integers and vice versa).
|
||||
[Flux type conversion functions](/{{< latest "flux" >}}/stdlib/universe/type-conversions/)
|
||||
provide much broader support for type conversions and let you perform some long-requested
|
||||
operations like casting boolean values to integers.
|
||||
|
||||
##### Cast boolean field values to integers
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "m" and r._field == "bool_field")
|
||||
|> toInt()
|
||||
```
|
||||
|
||||
### String manipulation and data shaping
|
||||
InfluxQL doesn't support string manipulation when querying data.
|
||||
The [Flux Strings package](/{{< latest "flux" >}}/stdlib/strings/) is a collection of functions that operate on string data.
|
||||
When combined with the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/),
|
||||
functions in the string package allow for operations like string sanitization and normalization.
|
||||
|
||||
```js
|
||||
import "strings"
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "weather" and r._field == "temp")
|
||||
|> map(
|
||||
fn: (r) => ({
|
||||
r with
|
||||
location: strings.toTitle(v: r.location),
|
||||
sensor: strings.replaceAll(v: r.sensor, t: " ", u: "-"),
|
||||
status: strings.substring(v: r.status, start: 0, end: 8),
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
### Work with geo-temporal data
|
||||
InfluxQL doesn't provide functionality for working with geo-temporal data.
|
||||
The [Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo/) is a collection of functions that
|
||||
let you shape, filter, and group geo-temporal data.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "geo/autogen")
|
||||
|> range(start: -1w)
|
||||
|> filter(fn: (r) => r._measurement == "taxi")
|
||||
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 20)
|
||||
|> geo.filterRows(region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0}, strict: true)
|
||||
|> geo.asTracks(groupBy: ["fare-id"])
|
||||
```
|
||||
|
||||
|
||||
## InfluxQL and Flux parity
|
||||
Flux is working towards complete parity with InfluxQL and new functions are being added to that end.
|
||||
The table below shows InfluxQL statements, clauses, and functions along with their equivalent Flux functions.
|
||||
|
||||
_For a complete list of Flux functions, [view all Flux functions](/{{< latest "flux" >}}/stdlib/all-functions)._
|
||||
|
||||
### InfluxQL and Flux parity
|
||||
|
||||
| InfluxQL | Flux Functions |
|
||||
| :------------------------------------------------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| [SELECT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-basic-select-statement) | [filter()](/{{< latest "flux" >}}/stdlib/universe/filter/) |
|
||||
| [WHERE](/enterprise_influxdb/v1.10/query_language/explore-data/#the-where-clause) | [filter()](/{{< latest "flux" >}}/stdlib/universe/filter/), [range()](/{{< latest "flux" >}}/stdlib/universe/range/) |
|
||||
| [GROUP BY](/enterprise_influxdb/v1.10/query_language/explore-data/#the-group-by-clause) | [group()](/{{< latest "flux" >}}/stdlib/universe/group/) |
|
||||
| [INTO](/enterprise_influxdb/v1.10/query_language/explore-data/#the-into-clause) | [to()](/{{< latest "flux" >}}/stdlib/universe/to/) <span><a style="color:orange" href="#footnote">*</a></span> |
|
||||
| [ORDER BY](/enterprise_influxdb/v1.10/query_language/explore-data/#order-by-time-desc) | [sort()](/{{< latest "flux" >}}/stdlib/universe/sort/) |
|
||||
| [LIMIT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-limit-clause) | [limit()](/{{< latest "flux" >}}/stdlib/universe/limit/) |
|
||||
| [SLIMIT](/enterprise_influxdb/v1.10/query_language/explore-data/#the-slimit-clause) | -- |
|
||||
| [OFFSET](/enterprise_influxdb/v1.10/query_language/explore-data/#the-offset-clause) | -- |
|
||||
| [SOFFSET](/enterprise_influxdb/v1.10/query_language/explore-data/#the-soffset-clause) | -- |
|
||||
| [SHOW DATABASES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-databases) | [buckets()](/{{< latest "flux" >}}/stdlib/universe/buckets/) |
|
||||
| [SHOW MEASUREMENTS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-measurements) | [v1.measurements](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurements) |
|
||||
| [SHOW FIELD KEYS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-field-keys) | [keys()](/{{< latest "flux" >}}/stdlib/universe/keys/) |
|
||||
| [SHOW RETENTION POLICIES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-retention-policies) | [buckets()](/{{< latest "flux" >}}/stdlib/universe/buckets/) |
|
||||
| [SHOW TAG KEYS](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-tag-keys) | [v1.tagKeys()](/{{< latest "flux" >}}/stdlib/influxdb-v1/tagkeys), [v1.measurementTagKeys()](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurementtagkeys) |
|
||||
| [SHOW TAG VALUES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-tag-values) | [v1.tagValues()](/{{< latest "flux" >}}/stdlib/influxdb-v1/tagvalues), [v1.measurementTagValues()](/{{< latest "flux" >}}/stdlib/influxdb-v1/measurementtagvalues) |
|
||||
| [SHOW SERIES](/enterprise_influxdb/v1.10/query_language/explore-schema/#show-series) | -- |
|
||||
| [CREATE DATABASE](/enterprise_influxdb/v1.10/query_language/manage-database/#create-database) | -- |
|
||||
| [DROP DATABASE](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-a-database-with-drop-database) | -- |
|
||||
| [DROP SERIES](/enterprise_influxdb/v1.10/query_language/manage-database/#drop-series-from-the-index-with-drop-series) | -- |
|
||||
| [DELETE](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-series-with-delete) | -- |
|
||||
| [DROP MEASUREMENT](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-measurements-with-drop-measurement) | -- |
|
||||
| [DROP SHARD](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-a-shard-with-drop-shard) | -- |
|
||||
| [CREATE RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy) | -- |
|
||||
| [ALTER RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy) | -- |
|
||||
| [DROP RETENTION POLICY](/enterprise_influxdb/v1.10/query_language/manage-database/#delete-retention-policies-with-drop-retention-policy) | -- |
|
||||
| [COUNT](/enterprise_influxdb/v1.10/query_language/functions#count) | [count()](/{{< latest "flux" >}}/stdlib/universe/count/) |
|
||||
| [DISTINCT](/enterprise_influxdb/v1.10/query_language/functions#distinct) | [distinct()](/{{< latest "flux" >}}/stdlib/universe/distinct/) |
|
||||
| [INTEGRAL](/enterprise_influxdb/v1.10/query_language/functions#integral) | [integral()](/{{< latest "flux" >}}/stdlib/universe/integral/) |
|
||||
| [MEAN](/enterprise_influxdb/v1.10/query_language/functions#mean) | [mean()](/{{< latest "flux" >}}/stdlib/universe/mean/) |
|
||||
| [MEDIAN](/enterprise_influxdb/v1.10/query_language/functions#median) | [median()](/{{< latest "flux" >}}/stdlib/universe/median/) |
|
||||
| [MODE](/enterprise_influxdb/v1.10/query_language/functions#mode) | [mode()](/{{< latest "flux" >}}/stdlib/universe/mode/) |
|
||||
| [SPREAD](/enterprise_influxdb/v1.10/query_language/functions#spread) | [spread()](/{{< latest "flux" >}}/stdlib/universe/spread/) |
|
||||
| [STDDEV](/enterprise_influxdb/v1.10/query_language/functions#stddev) | [stddev()](/{{< latest "flux" >}}/stdlib/universe/stddev/) |
|
||||
| [SUM](/enterprise_influxdb/v1.10/query_language/functions#sum) | [sum()](/{{< latest "flux" >}}/stdlib/universe/sum/) |
|
||||
| [BOTTOM](/enterprise_influxdb/v1.10/query_language/functions#bottom) | [bottom()](/{{< latest "flux" >}}/stdlib/universe/bottom/) |
|
||||
| [FIRST](/enterprise_influxdb/v1.10/query_language/functions#first) | [first()](/{{< latest "flux" >}}/stdlib/universe/first/) |
|
||||
| [LAST](/enterprise_influxdb/v1.10/query_language/functions#last) | [last()](/{{< latest "flux" >}}/stdlib/universe/last/) |
|
||||
| [MAX](/enterprise_influxdb/v1.10/query_language/functions#max) | [max()](/{{< latest "flux" >}}/stdlib/universe/max/) |
|
||||
| [MIN](/enterprise_influxdb/v1.10/query_language/functions#min) | [min()](/{{< latest "flux" >}}/stdlib/universe/min/) |
|
||||
| [PERCENTILE](/enterprise_influxdb/v1.10/query_language/functions#percentile) | [quantile()](/{{< latest "flux" >}}/stdlib/universe/quantile/) |
|
||||
| [SAMPLE](/enterprise_influxdb/v1.10/query_language/functions#sample) | [sample()](/{{< latest "flux" >}}/stdlib/universe/sample/) |
|
||||
| [TOP](/enterprise_influxdb/v1.10/query_language/functions#top) | [top()](/{{< latest "flux" >}}/stdlib/universe/top/) |
|
||||
| [ABS](/enterprise_influxdb/v1.10/query_language/functions#abs) | [math.abs()](/{{< latest "flux" >}}/stdlib/math/abs/) |
|
||||
| [ACOS](/enterprise_influxdb/v1.10/query_language/functions#acos) | [math.acos()](/{{< latest "flux" >}}/stdlib/math/acos/) |
|
||||
| [ASIN](/enterprise_influxdb/v1.10/query_language/functions#asin) | [math.asin()](/{{< latest "flux" >}}/stdlib/math/asin/) |
|
||||
| [ATAN](/enterprise_influxdb/v1.10/query_language/functions#atan) | [math.atan()](/{{< latest "flux" >}}/stdlib/math/atan/) |
|
||||
| [ATAN2](/enterprise_influxdb/v1.10/query_language/functions#atan2) | [math.atan2()](/{{< latest "flux" >}}/stdlib/math/atan2/) |
|
||||
| [CEIL](/enterprise_influxdb/v1.10/query_language/functions#ceil) | [math.ceil()](/{{< latest "flux" >}}/stdlib/math/ceil/) |
|
||||
| [COS](/enterprise_influxdb/v1.10/query_language/functions#cos) | [math.cos()](/{{< latest "flux" >}}/stdlib/math/cos/) |
|
||||
| [CUMULATIVE_SUM](/enterprise_influxdb/v1.10/query_language/functions#cumulative-sum) | [cumulativeSum()](/{{< latest "flux" >}}/stdlib/universe/cumulativesum/) |
|
||||
| [DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#derivative) | [derivative()](/{{< latest "flux" >}}/stdlib/universe/derivative/) |
|
||||
| [DIFFERENCE](/enterprise_influxdb/v1.10/query_language/functions#difference) | [difference()](/{{< latest "flux" >}}/stdlib/universe/difference/) |
|
||||
| [ELAPSED](/enterprise_influxdb/v1.10/query_language/functions#elapsed) | [elapsed()](/{{< latest "flux" >}}/stdlib/universe/elapsed/) |
|
||||
| [EXP](/enterprise_influxdb/v1.10/query_language/functions#exp) | [math.exp()](/{{< latest "flux" >}}/stdlib/math/exp/) |
|
||||
| [FLOOR](/enterprise_influxdb/v1.10/query_language/functions#floor) | [math.floor()](/{{< latest "flux" >}}/stdlib/math/floor/) |
|
||||
| [HISTOGRAM](/enterprise_influxdb/v1.10/query_language/functions#histogram) | [histogram()](/{{< latest "flux" >}}/stdlib/universe/histogram/) |
|
||||
| [LN](/enterprise_influxdb/v1.10/query_language/functions#ln) | [math.log()](/{{< latest "flux" >}}/stdlib/math/log/) |
|
||||
| [LOG](/enterprise_influxdb/v1.10/query_language/functions#log) | [math.logb()](/{{< latest "flux" >}}/stdlib/math/logb/) |
|
||||
| [LOG2](/enterprise_influxdb/v1.10/query_language/functions#log2) | [math.log2()](/{{< latest "flux" >}}/stdlib/math/log2/) |
|
||||
| [LOG10](/enterprise_influxdb/v1.10/query_language/functions/#log10) | [math.log10()](/{{< latest "flux" >}}/stdlib/math/log10/) |
|
||||
| [MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#moving-average) | [movingAverage()](/{{< latest "flux" >}}/stdlib/universe/movingaverage/) |
|
||||
| [NON_NEGATIVE_DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#non-negative-derivative) | [derivative(nonNegative:true)](/{{< latest "flux" >}}/stdlib/universe/derivative/) |
|
||||
| [NON_NEGATIVE_DIFFERENCE](/enterprise_influxdb/v1.10/query_language/functions#non-negative-difference)                                                       | [difference(nonNegative:true)](/{{< latest "flux" >}}/stdlib/universe/difference/)                                                                                    |
|
||||
| [POW](/enterprise_influxdb/v1.10/query_language/functions#pow) | [math.pow()](/{{< latest "flux" >}}/stdlib/math/pow/) |
|
||||
| [ROUND](/enterprise_influxdb/v1.10/query_language/functions#round) | [math.round()](/{{< latest "flux" >}}/stdlib/math/round/) |
|
||||
| [SIN](/enterprise_influxdb/v1.10/query_language/functions#sin) | [math.sin()](/{{< latest "flux" >}}/stdlib/math/sin/) |
|
||||
| [SQRT](/enterprise_influxdb/v1.10/query_language/functions#sqrt) | [math.sqrt()](/{{< latest "flux" >}}/stdlib/math/sqrt/) |
|
||||
| [TAN](/enterprise_influxdb/v1.10/query_language/functions#tan) | [math.tan()](/{{< latest "flux" >}}/stdlib/math/tan/) |
|
||||
| [HOLT_WINTERS](/enterprise_influxdb/v1.10/query_language/functions#holt-winters) | [holtWinters()](/{{< latest "flux" >}}/stdlib/universe/holtwinters/) |
|
||||
| [CHANDE_MOMENTUM_OSCILLATOR](/enterprise_influxdb/v1.10/query_language/functions#chande-momentum-oscillator) | [chandeMomentumOscillator()](/{{< latest "flux" >}}/stdlib/universe/chandemomentumoscillator/) |
|
||||
| [EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#exponential-moving-average) | [exponentialMovingAverage()](/{{< latest "flux" >}}/stdlib/universe/exponentialmovingaverage/) |
|
||||
| [DOUBLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#double-exponential-moving-average) | [doubleEMA()](/{{< latest "flux" >}}/stdlib/universe/doubleema/) |
|
||||
| [KAUFMANS_EFFICIENCY_RATIO](/enterprise_influxdb/v1.10/query_language/functions#kaufmans-efficiency-ratio) | [kaufmansER()](/{{< latest "flux" >}}/stdlib/universe/kaufmanser/) |
|
||||
| [KAUFMANS_ADAPTIVE_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#kaufmans-adaptive-moving-average) | [kaufmansAMA()](/{{< latest "flux" >}}/stdlib/universe/kaufmansama/) |
|
||||
| [TRIPLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.10/query_language/functions#triple-exponential-moving-average) | [tripleEMA()](/{{< latest "flux" >}}/stdlib/universe/tripleema/) |
|
||||
| [TRIPLE_EXPONENTIAL_DERIVATIVE](/enterprise_influxdb/v1.10/query_language/functions#triple-exponential-derivative) | [tripleExponentialDerivative()](/{{< latest "flux" >}}/stdlib/universe/tripleexponentialderivative/) |
|
||||
| [RELATIVE_STRENGTH_INDEX](/enterprise_influxdb/v1.10/query_language/functions#relative-strength-index) | [relativeStrengthIndex()](/{{< latest "flux" >}}/stdlib/universe/relativestrengthindex/) |
|
||||
|
||||
_<span style="font-size:.9rem" id="footnote"><span style="color:orange">*</span> The <code>to()</code> function only writes to InfluxDB 2.0.</span>_
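
To illustrate how the table reads in practice, here is a hedged sketch of a typical InfluxQL aggregation expressed in Flux (the bucket, measurement, and field names are only examples):

```js
// InfluxQL: SELECT mean("usage_system") FROM "cpu" WHERE time > now() - 1h GROUP BY time(5m)
from(bucket: "telegraf/autogen")
    |> range(start: -1h)                                                          // WHERE time > now() - 1h
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")  // FROM "cpu" / SELECT "usage_system"
    |> aggregateWindow(every: 1m, fn: mean)                                       // GROUP BY time(...) with MEAN
```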
|
|
@ -0,0 +1,115 @@
|
|||
---
|
||||
title: Get started with Flux
|
||||
description: >
|
||||
Get started with Flux, InfluxData's new functional data scripting language.
|
||||
This step-by-step guide will walk you through the basics and get you on your way.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Get started with Flux
|
||||
identifier: get-started
|
||||
parent: Flux
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/getting-started/
|
||||
- /enterprise_influxdb/v1.10/flux/introduction/getting-started/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/
|
||||
v2: /influxdb/v2.0/query-data/get-started/
|
||||
---
|
||||
|
||||
Flux is InfluxData's new functional data scripting language designed for querying,
|
||||
analyzing, and acting on data.
|
||||
|
||||
This multi-part getting started guide walks through important concepts related to Flux.
|
||||
It covers querying time series data from InfluxDB using Flux, and introduces Flux syntax and functions.
|
||||
|
||||
## What you will need
|
||||
|
||||
##### InfluxDB v1.8+
|
||||
Flux v0.65 is built into InfluxDB v1.8 and can be used to query data stored in InfluxDB.
|
||||
|
||||
---
|
||||
|
||||
_For information about downloading and installing InfluxDB, see [InfluxDB installation](/enterprise_influxdb/v1.10/introduction/installation)._
|
||||
|
||||
---
|
||||
|
||||
##### Chronograf v1.8+
|
||||
**Not required but strongly recommended**.
|
||||
Chronograf v1.8's Data Explorer provides a user interface (UI) for writing Flux scripts and visualizing results.
|
||||
Dashboards in Chronograf v1.8+ also support Flux queries.
|
||||
|
||||
---
|
||||
|
||||
_For information about downloading and installing Chronograf, see [Chronograf installation](/{{< latest "chronograf" >}}/introduction/installation)._
|
||||
|
||||
---
|
||||
|
||||
## Key concepts
|
||||
Flux introduces important new concepts you should understand as you get started.
|
||||
|
||||
### Buckets
|
||||
Flux introduces "buckets," a new data storage concept for InfluxDB.
|
||||
A **bucket** is a named location where data is stored that has a retention policy.
|
||||
It's similar to an InfluxDB v1.x "database," but is a combination of both a database and a retention policy.
|
||||
When using multiple retention policies, each retention policy is treated as its own bucket.
|
||||
|
||||
Flux's [`from()` function](/{{< latest "flux" >}}/stdlib/universe/from), which defines an InfluxDB data source, requires a `bucket` parameter.
|
||||
When using Flux with InfluxDB v1.x, use the following bucket naming convention which combines
|
||||
the database name and retention policy into a single bucket name:
|
||||
|
||||
###### InfluxDB v1.x bucket naming convention
|
||||
```js
|
||||
// Pattern
|
||||
from(bucket:"<database>/<retention-policy>")
|
||||
|
||||
// Example
|
||||
from(bucket:"telegraf/autogen")
|
||||
```
|
||||
|
||||
### Pipe-forward operator
|
||||
Flux uses pipe-forward operators (`|>`) extensively to chain operations together.
|
||||
After each function or operation, Flux returns a table or collection of tables containing data.
|
||||
The pipe-forward operator pipes those tables into the next function or operation where
|
||||
they are further processed or manipulated.
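
For example, the following sketch (the bucket and measurement names are only examples) pipes the output tables of each function into the next:

```js
from(bucket: "telegraf/autogen")
    |> range(start: -15m)                          // output tables are piped forward...
    |> filter(fn: (r) => r._measurement == "cpu")  // ...into filter()...
    |> mean()                                      // ...and then into mean()
```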
|
||||
|
||||
### Tables
|
||||
Flux structures all data in tables.
|
||||
When data is streamed from data sources, Flux formats it as annotated comma-separated values (CSV), representing tables.
|
||||
Functions then manipulate or process them and output new tables.
|
||||
This makes it easy to chain together functions to build sophisticated queries.
|
||||
|
||||
#### Group keys
|
||||
Every table has a **group key** which describes the contents of the table.
|
||||
It's a list of columns for which every row in the table will have the same value.
|
||||
Columns with unique values in each row are **not** part of the group key.
|
||||
|
||||
As functions process and transform data, each modifies the group keys of output tables.
|
||||
Understanding how tables and group keys are modified by functions is key to properly
|
||||
shaping your data for the desired output.
|
||||
|
||||
###### Example group key
|
||||
```js
|
||||
[_start, _stop, _field, _measurement, host]
|
||||
```
|
||||
|
||||
Note that `_time` and `_value` are excluded from the example group key because they
|
||||
are unique to each row.
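
As a short example of how a function can change the group key, regrouping by `host` replaces the group key above with one that contains only `host` (the bucket, measurement, and field names are only examples):

```js
from(bucket: "telegraf/autogen")
    |> range(start: -15m)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    |> group(columns: ["host"])  // output tables are now keyed by "host" only
```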
|
||||
|
||||
## Tools for working with Flux
|
||||
|
||||
You have multiple [options for writing and running Flux queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/),
|
||||
but as you're getting started, we recommend using the following:
|
||||
|
||||
### Chronograf's Data Explorer
|
||||
Chronograf's Data Explorer makes it easy to write your first Flux script and visualize the results.
|
||||
To use Chronograf's Flux UI, open the **Data Explorer** and to the right of the source
|
||||
dropdown above the graph placeholder, select **Flux** as the source type.
|
||||
|
||||
This will provide **Schema**, **Script**, and **Functions** panes.
|
||||
The Schema pane allows you to explore your data.
|
||||
The Script pane is where you write your Flux script.
|
||||
The Functions pane provides a list of functions available in your Flux queries.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/query-influxdb/">Query InfluxDB with Flux</a>
|
||||
</div>
|
|
@ -0,0 +1,130 @@
|
|||
---
|
||||
title: Query InfluxDB with Flux
|
||||
description: Learn the basics of using Flux to query data from InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Query InfluxDB
|
||||
parent: get-started
|
||||
weight: 1
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/getting-started/query-influxdb/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/query-influxdb/
|
||||
v2: /influxdb/v2.0/query-data/get-started/query-influxdb/
|
||||
---
|
||||
|
||||
This guide walks through the basics of using Flux to query data from InfluxDB.
|
||||
_**If you haven't already, make sure to install InfluxDB v1.8+, [enable Flux](/enterprise_influxdb/v1.10/flux/installation),
|
||||
and choose a [tool for writing Flux queries](/enterprise_influxdb/v1.10/flux/get-started#tools-for-working-with-flux).**_
|
||||
|
||||
The following queries can be executed using any of the methods described in
|
||||
[Execute Flux queries](/enterprise_influxdb/v1.10/flux/execute-queries/).
|
||||
Be sure to provide your InfluxDB Enterprise authorization credentials with each method.
|
||||
|
||||
Every Flux query needs the following:
|
||||
|
||||
1. [A data source](#1-define-your-data-source)
|
||||
2. [A time range](#2-specify-a-time-range)
|
||||
3. [Data filters](#3-filter-your-data)
|
||||
|
||||
|
||||
## 1. Define your data source
|
||||
Flux's [`from()`](/{{< latest "flux" >}}/stdlib/universe/from) function defines an InfluxDB data source.
|
||||
It requires a [`bucket`](/enterprise_influxdb/v1.10/flux/get-started/#buckets) parameter.
|
||||
For this example, use `telegraf/autogen`, a combination of the default database and retention policy provided by the TICK stack.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
```
|
||||
|
||||
## 2. Specify a time range
|
||||
Flux requires a time range when querying time series data.
|
||||
"Unbounded" queries are very resource-intensive and as a protective measure,
|
||||
Flux will not query the database without a specified range.
|
||||
|
||||
Use the pipe-forward operator (`|>`) to pipe data from your data source into the [`range()`](/{{< latest "flux" >}}/stdlib/universe/range)
|
||||
function, which specifies a time range for your query.
|
||||
It accepts two properties: `start` and `stop`.
|
||||
Ranges can be **relative** using negative [durations](/{{< latest "flux" >}}/spec/lexical-elements#duration-literals)
|
||||
or **absolute** using [timestamps](/{{< latest "flux" >}}/spec/lexical-elements#date-and-time-literals).
|
||||
|
||||
###### Example relative time ranges
|
||||
```js
|
||||
// Relative time range with start only. Stop defaults to now.
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|
||||
// Relative time range with start and stop
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h, stop: -10m)
|
||||
```
|
||||
|
||||
> Relative ranges are relative to "now."
|
||||
|
||||
###### Example absolute time range
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: 2018-11-05T23:30:00Z, stop: 2018-11-06T00:00:00Z)
|
||||
```
|
||||
|
||||
#### Use the following:
|
||||
For this guide, use the relative time range, `-15m`, to limit query results to data from the last 15 minutes:
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
```
|
||||
|
||||
## 3. Filter your data
|
||||
Pass your ranged data into the `filter()` function to narrow results based on data attributes or columns.
|
||||
The `filter()` function has one parameter, `fn`, which expects an anonymous function
|
||||
with logic that filters data based on columns or attributes.
|
||||
|
||||
Flux's anonymous function syntax is very similar to JavaScript's.
|
||||
Records or rows are passed into the `filter()` function as a record (`r`).
|
||||
The anonymous function takes the record and evaluates it to see if it matches the defined filters.
|
||||
Use the `and` operator to chain multiple filters.
|
||||
|
||||
```js
|
||||
// Pattern
|
||||
(r) => (r.recordProperty comparisonOperator comparisonExpression)
|
||||
|
||||
// Example with single filter
|
||||
(r) => (r._measurement == "cpu")
|
||||
|
||||
// Example with multiple filters
|
||||
(r) => (r._measurement == "cpu") and (r._field != "usage_system" )
|
||||
```
|
||||
|
||||
#### Use the following:
|
||||
For this example, filter by the `cpu` measurement, the `usage_system` field, and the `cpu-total` tag value:
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
```
|
||||
|
||||
## 4. Yield your queried data
|
||||
Use Flux's `yield()` function to output the filtered tables as the result of the query.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> yield()
|
||||
```
|
||||
|
||||
> Chronograf and the `influx` CLI automatically assume a `yield()` function at
|
||||
> the end of each script in order to output and visualize the data.
|
||||
> Best practice is to include a `yield()` function, but it is not always necessary.
|
||||
|
||||
## Congratulations!
|
||||
You have now queried data from InfluxDB using Flux.
|
||||
|
||||
The query shown here is a barebones example.
|
||||
Flux queries can be extended in many ways to form powerful scripts.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/">Get started with Flux</a>
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/transform-data/">Transform your data</a>
|
||||
</div>
|
|
@ -0,0 +1,211 @@
|
|||
---
|
||||
title: Flux syntax basics
|
||||
description: An introduction to the basic elements of the Flux syntax with real-world application examples.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Syntax basics
|
||||
parent: get-started
|
||||
weight: 3
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/getting-started/syntax-basics/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/
|
||||
v2: /influxdb/v2.0/query-data/get-started/syntax-basics/
|
||||
---
|
||||
|
||||
|
||||
Flux, at its core, is a scripting language designed specifically for working with data.
|
||||
This guide walks through a handful of simple expressions and how they are handled in Flux.
|
||||
|
||||
### Simple expressions
|
||||
Flux is a scripting language that supports basic expressions.
|
||||
For example, simple addition:
|
||||
|
||||
```js
|
||||
> 1 + 1
|
||||
2
|
||||
```
|
||||
|
||||
### Variables
|
||||
Assign an expression to a variable using the assignment operator, `=`.
|
||||
|
||||
```js
|
||||
> s = "this is a string"
|
||||
> i = 1 // an integer
|
||||
> f = 2.0 // a floating point number
|
||||
```
|
||||
|
||||
Type the name of a variable to print its value:
|
||||
|
||||
```js
|
||||
> s
|
||||
this is a string
|
||||
> i
|
||||
1
|
||||
> f
|
||||
2
|
||||
```
|
||||
|
||||
### Records
|
||||
Flux also supports records. Each value in a record can be a different data type.
|
||||
|
||||
```js
|
||||
> o = {name:"Jim", age: 42, "favorite color": "red"}
|
||||
```
|
||||
|
||||
Use **dot notation** to access the properties of a record:
|
||||
|
||||
```js
|
||||
> o.name
|
||||
Jim
|
||||
> o.age
|
||||
42
|
||||
```
|
||||
|
||||
Or **bracket notation**:
|
||||
|
||||
```js
|
||||
> o["name"]
|
||||
Jim
|
||||
> o["age"]
|
||||
42
|
||||
> o["favorite color"]
|
||||
red
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Use bracket notation to reference record properties with special or
|
||||
white space characters in the property key.
|
||||
{{% /note %}}
|
||||
|
||||
### Lists
|
||||
Flux supports lists. List values must be the same type.
|
||||
|
||||
```js
|
||||
> n = 4
|
||||
> l = [1,2,3,n]
|
||||
> l
|
||||
[1, 2, 3, 4]
|
||||
```
|
||||
|
||||
### Functions
|
||||
Flux uses functions for most of its heavy lifting.
|
||||
Below is a simple function that squares a number, `n`.
|
||||
|
||||
```js
|
||||
> square = (n) => n * n
|
||||
> square(n:3)
|
||||
9
|
||||
```
|
||||
|
||||
> Flux does not support positional arguments or parameters.
|
||||
> Parameters must always be named when calling a function.
|
||||
|
||||
### Pipe-forward operator
|
||||
Flux uses the pipe-forward operator (`|>`) extensively to chain operations together.
|
||||
After each function or operation, Flux returns a table or collection of tables containing data.
|
||||
The pipe-forward operator pipes those tables into the next function where they are further processed or manipulated.
|
||||
|
||||
```js
|
||||
data |> someFunction() |> anotherFunction()
|
||||
```
|
||||
|
||||
## Real-world application of basic syntax
|
||||
This likely seems familiar if you've already been through the other [getting started guides](/enterprise_influxdb/v1.10/flux/get-started).
|
||||
Flux's syntax is inspired by JavaScript and other functional scripting languages.
|
||||
As you begin to apply these basic principles in real-world use cases such as creating data stream variables,
|
||||
custom functions, etc., the power of Flux and its ability to query and process data will become apparent.
|
||||
|
||||
The examples below provide both multi-line and single-line versions of each input command.
|
||||
Carriage returns in Flux aren't necessary, but do help with readability.
|
||||
Both single- and multi-line commands can be copied and pasted into the `influx` CLI running in Flux mode.
|
||||
|
||||
{{< tabs-wrapper >}}
|
||||
{{% tabs %}}
|
||||
[Multi-line inputs](#)
|
||||
[Single-line inputs](#)
|
||||
{{% /tabs %}}
|
||||
{{% tab-content %}}
|
||||
### Define data stream variables
|
||||
A common use case for variable assignments in Flux is creating variables for one
|
||||
or more input data streams.
|
||||
|
||||
```js
|
||||
timeRange = -1h
|
||||
|
||||
cpuUsageUser = from(bucket: "telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu == "cpu-total")
|
||||
|
||||
memUsagePercent = from(bucket: "telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
```
|
||||
|
||||
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
|
||||
|
||||
### Define custom functions
|
||||
Create a function that returns the `n` rows in the input stream with the highest `_value`s.
|
||||
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
|
||||
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
|
||||
|
||||
```js
|
||||
topN = (tables=<-, n) => tables
|
||||
|> sort(desc: true)
|
||||
|> limit(n: n)
|
||||
```
|
||||
|
||||
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
|
||||
|
||||
Using this new custom function `topN` and the `cpuUsageUser` data stream variable defined above,
|
||||
find the top five data points and yield the results.
|
||||
|
||||
```js
|
||||
cpuUsageUser
|
||||
|> topN(n: 5)
|
||||
|> yield()
|
||||
```
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
### Define data stream variables
|
||||
A common use case for variable assignments in Flux is creating variables for multiple filtered input data streams.
|
||||
|
||||
```js
|
||||
timeRange = -1h
|
||||
|
||||
cpuUsageUser = from(bucket: "telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu == "cpu-total")
|
||||
|
||||
memUsagePercent = from(bucket: "telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
```
|
||||
|
||||
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
|
||||
|
||||
### Define custom functions
|
||||
Let's create a function that returns the `n` rows in the input data stream with the highest `_value`s.
|
||||
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
|
||||
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
|
||||
|
||||
```js
|
||||
topN = (tables=<-, n) => tables |> sort(desc: true) |> limit(n: n)
|
||||
```
|
||||
|
||||
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
|
||||
|
||||
Using the `cpuUsageUser` data stream variable defined [above](#define-data-stream-variables),
|
||||
find the top five data points with the custom `topN` function and yield the results.
|
||||
|
||||
```js
|
||||
cpuUsageUser |> topN(n:5) |> yield()
|
||||
```
|
||||
{{% /tab-content %}}
|
||||
{{< /tabs-wrapper >}}
|
||||
|
||||
This query will return the five data points with the highest user CPU usage over the last hour.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/transform-data/">Transform your data</a>
|
||||
</div>
|
|
@ -0,0 +1,160 @@
|
|||
---
|
||||
title: Transform data with Flux
|
||||
description: Learn the basics of using Flux to transform data queried from InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Transform your data
|
||||
parent: get-started
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/getting-started/transform-data/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/transform-data/
|
||||
v2: /influxdb/v2.0/query-data/get-started/transform-data/
|
||||
---
|
||||
|
||||
When [querying data from InfluxDB](/enterprise_influxdb/v1.10/flux/get-started/query-influxdb),
|
||||
you often need to transform that data in some way.
|
||||
Common examples are aggregating data into averages, downsampling data, etc.
|
||||
|
||||
This guide demonstrates using [Flux functions](/{{< latest "flux" >}}/stdlib/) to transform your data.
|
||||
It walks through creating a Flux script that partitions data into windows of time,
|
||||
averages the `_value`s in each window, and outputs the averages as a new table.
|
||||
|
||||
It's important to understand how the "shape" of your data changes through each of these operations.
|
||||
|
||||
## Query data
|
||||
Use the query built in the previous [Query data from InfluxDB](/enterprise_influxdb/v1.10/flux/get-started/query-influxdb)
|
||||
guide, but update the range to pull data from the last hour:
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
```
|
||||
|
||||
## Flux functions
|
||||
Flux provides a number of functions that perform specific operations, transformations, and tasks.
|
||||
You can also [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) in your Flux queries.
|
||||
_Functions are covered in detail in the [Flux standard library](/{{< latest "flux" >}}/stdlib/) documentation._
|
||||
|
||||
A common type of function used when transforming data queried from InfluxDB is an aggregate function.
|
||||
Aggregate functions take a set of `_value`s in a table, aggregate them, and transform
|
||||
them into a new value.
|
||||
|
||||
This example uses the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean)
|
||||
to average values within time windows.
|
||||
|
||||
> The following example walks through the steps required to window and aggregate data,
|
||||
> but there is an [`aggregateWindow()` helper function](#helper-functions) that does it for you.
|
||||
> It's just good to understand the steps in the process.
|
||||
|
||||
## Window your data
|
||||
Flux's [`window()` function](/{{< latest "flux" >}}/stdlib/universe/window) partitions records based on a time value.
|
||||
Use the `every` parameter to define a duration of time for each window.
|
||||
|
||||
{{% note %}}
|
||||
#### Calendar months and years
|
||||
`every` supports all [valid duration units](/{{< latest "flux" >}}/spec/types/#duration-types),
|
||||
including **calendar months (`1mo`)** and **years (`1y`)**.
|
||||
{{% /note %}}
|
||||
|
||||
For this example, window data in five-minute intervals (`5m`).
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> window(every: 5m)
|
||||
```
|
||||
|
||||
As data is gathered into windows of time, each window is output as its own table.
|
||||
When visualized, each table is assigned a unique color.
|
||||
|
||||
![Windowed data tables](/img/flux/windowed-data.png)
|
||||
|
||||
## Aggregate windowed data
|
||||
Flux aggregate functions take the `_value`s in each table and aggregate them in some way.
|
||||
Use the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean) to average the `_value`s of each table.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
```
|
||||
|
||||
As rows in each window are aggregated, their output table contains only a single row with the aggregate value.
|
||||
Windowed tables are all still separate and, when visualized, will appear as single, unconnected points.
|
||||
|
||||
![Windowed aggregate data](/img/flux/windowed-aggregates.png)
|
||||
|
||||
## Add times to your aggregates
|
||||
As values are aggregated, the resulting tables do not have a `_time` column because
|
||||
the records used for the aggregation all have different timestamps.
|
||||
Aggregate functions don't infer what time should be used for the aggregate value.
|
||||
Therefore the `_time` column is dropped.
|
||||
|
||||
A `_time` column is required in the [next operation](#unwindow-aggregate-tables).
|
||||
To add one, use the [`duplicate()` function](/{{< latest "flux" >}}/stdlib/universe/duplicate)
|
||||
to duplicate the `_stop` column as the `_time` column for each windowed table.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
```
|
||||
|
||||
## Unwindow aggregate tables
|
||||
|
||||
Use the `window()` function with the `every: inf` parameter to gather all points
|
||||
into a single, infinite window.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
|> window(every: inf)
|
||||
```
|
||||
|
||||
Once ungrouped and combined into a single table, the aggregate data points will appear connected in your visualization.
|
||||
|
||||
![Unwindowed aggregate data](/img/flux/windowed-aggregates-ungrouped.png)
|
||||
|
||||
## Helper functions
|
||||
This may seem like a lot of coding just to build a query that aggregates data. However, going through the
|
||||
process helps you understand how data changes "shape" as it is passed through each function.
|
||||
|
||||
Flux provides (and allows you to create) "helper" functions that abstract many of these steps.
|
||||
The same operation performed in this guide can be accomplished using the
|
||||
[`aggregateWindow()` function](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow).
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system" and r.cpu == "cpu-total")
|
||||
|> aggregateWindow(every: 5m, fn: mean)
|
||||
```
|
||||
|
||||
## Congratulations!
|
||||
You have now constructed a Flux query that uses Flux functions to transform your data.
|
||||
There are many more ways to manipulate your data using both Flux's primitive functions
|
||||
and your own custom functions, but this is a good introduction into the basic syntax and query structure.
|
||||
|
||||
---
|
||||
|
||||
_For a deeper dive into windowing and aggregating data with example data output for each transformation,
|
||||
view the [Windowing and aggregating data](/enterprise_influxdb/v1.10/flux/guides/window-aggregate) guide._
|
||||
|
||||
---
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.10/flux/get-started/query-influxdb/">Query InfluxDB</a>
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.10/flux/get-started/syntax-basics/">Syntax basics</a>
|
||||
</div>
|
|
@ -0,0 +1,37 @@
|
|||
---
|
||||
title: Query data with Flux
|
||||
description: Guides that walk through both common and complex queries and use cases for Flux.
|
||||
weight: 3
|
||||
aliases:
|
||||
- /flux/latest/
|
||||
- /flux/latest/introduction
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Query with Flux
|
||||
parent: Flux
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/
|
||||
v2: /influxdb/v2.0/query-data/flux/
|
||||
---
|
||||
|
||||
The following guides walk through both common and complex queries and use cases for Flux.
|
||||
|
||||
{{% note %}}
|
||||
#### Example data variable
|
||||
Many of the examples provided in the following guides use a `data` variable,
|
||||
which represents a basic query that filters data by measurement and field.
|
||||
`data` is defined as:
|
||||
|
||||
```js
|
||||
data = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
## Flux query guides
|
||||
|
||||
{{< children type="anchored-list" pages="all" >}}
|
||||
|
||||
---
|
||||
|
||||
{{< children pages="all" readmore="true" hr="true" >}}
|
|
@ -0,0 +1,202 @@
|
|||
---
|
||||
title: Calculate percentages with Flux
|
||||
list_title: Calculate percentages
|
||||
description: >
|
||||
Use `pivot()` or `join()` and the `map()` function to align operand values into rows and calculate a percentage.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Calculate percentages
|
||||
identifier: flux-calc-perc
|
||||
parent: Query with Flux
|
||||
weight: 6
|
||||
list_query_example: percentages
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/
|
||||
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
|
||||
---
|
||||
|
||||
Calculating percentages from queried data is a common use case for time series data.
|
||||
To calculate a percentage in Flux, operands must be in each row.
|
||||
Use `map()` to re-map values in the row and calculate a percentage.
|
||||
|
||||
**To calculate percentages**
|
||||
|
||||
1. Use [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from/),
|
||||
[`range()`](/{{< latest "flux" >}}/stdlib/universe/range/) and
|
||||
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/) to query operands.
|
||||
2. Use [`pivot()` or `join()`](/enterprise_influxdb/v1.10/flux/guides/mathematic-operations/#pivot-vs-join)
|
||||
to align operand values into rows.
|
||||
3. Use [`map()`](/{{< latest "flux" >}}/stdlib/universe/map/)
|
||||
to divide the numerator operand value by the denominator operand value and multiply by 100.
|
||||
|
||||
{{% note %}}
|
||||
The following examples use `pivot()` to align operands into rows because
|
||||
`pivot()` works in most cases and is more performant than `join()`.
|
||||
_See [Pivot vs join](/enterprise_influxdb/v1.10/flux/guides/mathematic-operations/#pivot-vs-join)._
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "m1" and r._field =~ /field[1-2]/)
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
|
||||
```
|
||||
|
||||
## GPU monitoring example
|
||||
The following example queries data from the `gpu-monitor` bucket and calculates the
|
||||
percentage of GPU memory used over time.
|
||||
Data includes the following:
|
||||
|
||||
- **`gpu` measurement**
|
||||
- **`mem_used` field**: used GPU memory in bytes
|
||||
- **`mem_total` field**: total GPU memory in bytes
|
||||
|
||||
### Query mem_used and mem_total fields
|
||||
```js
|
||||
from(bucket: "gpu-monitor")
|
||||
|> range(start: 2020-01-01T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/)
|
||||
```
|
||||
|
||||
###### Returns the following stream of tables:
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_used | 2517924577 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_used | 2695091978 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_used | 2576980377 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_used | 3006477107 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_used | 3543348019 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_used | 4402341478 |
|
||||
|
||||
<p style="margin:-2.5rem 0;"></p>
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_total | 8589934592 |
|
||||
|
||||
### Pivot fields into columns
|
||||
Use `pivot()` to pivot the `mem_used` and `mem_total` fields into columns.
|
||||
Output includes `mem_used` and `mem_total` columns with values for each corresponding `_time`.
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
```
|
||||
|
||||
###### Returns the following:
|
||||
|
||||
| _time | _measurement | mem_used | mem_total |
|
||||
|:----- |:------------:| --------: | ---------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | 2517924577 | 8589934592 |
|
||||
| 2020-01-01T00:00:10Z | gpu | 2695091978 | 8589934592 |
|
||||
| 2020-01-01T00:00:20Z | gpu | 2576980377 | 8589934592 |
|
||||
| 2020-01-01T00:00:30Z | gpu | 3006477107 | 8589934592 |
|
||||
| 2020-01-01T00:00:40Z | gpu | 3543348019 | 8589934592 |
|
||||
| 2020-01-01T00:00:50Z | gpu | 4402341478 | 8589934592 |
|
||||
|
||||
### Map new values
|
||||
Each row now contains the values necessary to calculate a percentage.
|
||||
Use `map()` to re-map values in each row.
|
||||
Divide `mem_used` by `mem_total` and multiply by 100 to return the percentage.
|
||||
|
||||
{{% note %}}
|
||||
To return a precise float percentage value that includes decimal points, the example
|
||||
below casts integer field values to floats and multiplies by a float value (`100.0`).
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> map(
|
||||
fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "mem_used_percent",
|
||||
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
|
||||
})
|
||||
)
|
||||
```
|
||||
##### Query results:
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_used_percent | 29.31 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_used_percent | 31.37 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_used_percent | 30.00 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_used_percent | 35.00 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_used_percent | 41.25 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_used_percent | 51.25 |
|
||||
|
||||
### Full query
|
||||
```js
|
||||
from(bucket: "gpu-monitor")
|
||||
|> range(start: 2020-01-01T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/ )
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(
|
||||
fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "mem_used_percent",
|
||||
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
#### Calculate percentages using multiple fields
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> filter(fn: (r) => r._field == "used_system" or r._field == "used_user" or r._field == "total")
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({r with _value: float(v: r.used_system + r.used_user) / float(v: r.total) * 100.0}))
|
||||
```
|
||||
|
||||
#### Calculate percentages using multiple measurements
|
||||
|
||||
1. Ensure measurements are in the same [bucket](/enterprise_influxdb/v1.10/flux/get-started/#buckets).
|
||||
2. Use `filter()` to include data from both measurements.
|
||||
3. Use `group()` to ungroup data and return a single table.
|
||||
4. Use `pivot()` to pivot fields into columns.
|
||||
5. Use `map()` to re-map rows and perform the percentage calculation.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => (r._measurement == "m1" or r._measurement == "m2") and (r._field == "field1" or r._field == "field2"))
|
||||
|> group()
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
|
||||
```
|
||||
|
||||
#### Calculate percentages using multiple data sources
|
||||
```js
|
||||
import "sql"
|
||||
import "influxdata/influxdb/secrets"
|
||||
|
||||
pgUser = secrets.get(key: "POSTGRES_USER")
|
||||
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
|
||||
pgHost = secrets.get(key: "POSTGRES_HOST")
|
||||
|
||||
t1 = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
|
||||
query: "SELECT id, name, available FROM exampleTable",
|
||||
)
|
||||
|
||||
t2 = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
|
||||
|
||||
join(tables: {t1: t1, t2: t2}, on: ["id"])
|
||||
|> map(fn: (r) => ({r with _value: r._value_t2 / r.available_t1 * 100.0}))
|
||||
```
|
|
@ -0,0 +1,213 @@
|
|||
---
|
||||
title: Query using conditional logic
|
||||
seotitle: Query using conditional logic in Flux
|
||||
list_title: Conditional logic
|
||||
description: >
|
||||
This guide describes how to use Flux conditional expressions, such as `if`,
|
||||
`else`, and `then`, to query and transform data. **Flux evaluates statements from left to right and stops evaluating once a condition matches.**
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Conditional logic
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/conditional-logic/
|
||||
v2: /influxdb/v2.0/query-data/flux/conditional-logic/
|
||||
list_code_example: |
|
||||
```js
|
||||
if color == "green" then "008000" else "ffffff"
|
||||
```
|
||||
---
|
||||
|
||||
Flux provides `if`, `then`, and `else` conditional expressions that allow for powerful and flexible Flux queries.
|
||||
|
||||
##### Conditional expression syntax
|
||||
```js
|
||||
// Pattern
|
||||
if <condition> then <action> else <alternative-action>
|
||||
|
||||
// Example
|
||||
if color == "green" then "008000" else "ffffff"
|
||||
```
|
||||
|
||||
Conditional expressions are most useful in the following contexts:
|
||||
|
||||
- When defining variables.
|
||||
- When using functions that operate on a single row at a time (
|
||||
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/),
|
||||
[`map()`](/{{< latest "flux" >}}/stdlib/universe/map/),
|
||||
[`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce) ).
|
||||
|
||||
## Evaluating conditional expressions
|
||||
|
||||
Flux evaluates statements in order and stops evaluating once a condition matches.
|
||||
|
||||
For example, given the following statement:
|
||||
|
||||
```js
|
||||
if r._value > 95.0000001 and r._value <= 100.0 then
|
||||
"critical"
|
||||
else if r._value > 85.0000001 and r._value <= 95.0 then
|
||||
"warning"
|
||||
else if r._value > 70.0000001 and r._value <= 85.0 then
|
||||
"high"
|
||||
else
|
||||
"normal"
|
||||
```
|
||||
|
||||
When `r._value` is 96, the output is "critical" and the remaining conditions are not evaluated.
|
||||
|
||||
## Examples
|
||||
|
||||
- [Conditionally set the value of a variable](#conditionally-set-the-value-of-a-variable)
|
||||
- [Create conditional filters](#create-conditional-filters)
|
||||
- [Conditionally transform column values with map()](#conditionally-transform-column-values-with-map)
|
||||
- [Conditionally increment a count with reduce()](#conditionally-increment-a-count-with-reduce)
|
||||
|
||||
### Conditionally set the value of a variable
|
||||
The following example sets the `overdue` variable based on the
|
||||
`dueDate` variable's relation to `now()`.
|
||||
|
||||
```js
|
||||
dueDate = 2019-05-01T00:00:00Z
|
||||
overdue = if dueDate < now() then true else false
|
||||
```
|
||||
|
||||
### Create conditional filters
|
||||
The following example uses an example `metric` variable to change how the query filters data.
|
||||
`metric` has three possible values:
|
||||
|
||||
- Memory
|
||||
- CPU
|
||||
- Disk
|
||||
|
||||
```js
|
||||
metric = "Memory"
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(
|
||||
fn: (r) => if metric == "Memory" then
|
||||
r._measurement == "mem" and r._field == "used_percent"
|
||||
else if v.metric == "CPU" then
|
||||
r._measurement == "cpu" and r._field == "usage_user"
|
||||
else if v.metric == "Disk" then
|
||||
r._measurement == "disk" and r._field == "used_percent"
|
||||
else
|
||||
r._measurement != "",
|
||||
)
|
||||
```
|
||||
|
||||
### Conditionally transform column values with map()
|
||||
The following example uses the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/)
|
||||
to conditionally transform column values.
|
||||
It sets the `level` column to a specific string based on `_value` column.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Comments](#)
|
||||
[Comments](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> map(
|
||||
fn: (r) => ({r with
|
||||
level: if r._value >= 95.0000001 and r._value <= 100.0 then
|
||||
"critical"
|
||||
else if r._value >= 85.0000001 and r._value <= 95.0 then
|
||||
"warning"
|
||||
else if r._value >= 70.0000001 and r._value <= 85.0 then
|
||||
"high"
|
||||
else
|
||||
"normal",
|
||||
}),
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> map(
|
||||
fn: (r) => ({
|
||||
// Retain all existing columns in the mapped row
|
||||
r with
|
||||
// Set the level column value based on the _value column
|
||||
level: if r._value >= 95.0000001 and r._value <= 100.0 then
|
||||
"critical"
|
||||
else if r._value >= 85.0000001 and r._value <= 95.0 then
|
||||
"warning"
|
||||
else if r._value >= 70.0000001 and r._value <= 85.0 then
|
||||
"high"
|
||||
else
|
||||
"normal",
|
||||
}),
|
||||
)
|
||||
```
|
||||
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
### Conditionally increment a count with reduce()
|
||||
The following example uses the [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
|
||||
and [`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce/)
|
||||
functions to count the number of records in every five minute window that exceed a defined threshold.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Comments](#)
|
||||
[Comments](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
threshold = 65.0
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
fn: (column, tables=<-) => tables
|
||||
|> reduce(
|
||||
identity: {above_threshold_count: 0.0},
|
||||
fn: (r, accumulator) => ({
|
||||
above_threshold_count: if r._value >= threshold then
|
||||
accumulator.above_threshold_count + 1.0
|
||||
else
|
||||
accumulator.above_threshold_count + 0.0,
|
||||
}),
|
||||
),
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
threshold = 65.0
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
// Aggregate data into 5 minute windows using a custom reduce() function
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
// Use a custom function in the fn parameter.
|
||||
// The aggregateWindow fn parameter requires 'column' and 'tables' parameters.
|
||||
fn: (column, tables=<-) => tables
|
||||
|> reduce(
|
||||
identity: {above_threshold_count: 0.0},
|
||||
fn: (r, accumulator) => ({
|
||||
// Conditionally increment above_threshold_count if
|
||||
// r.value exceeds the threshold
|
||||
above_threshold_count: if r._value >= threshold then
|
||||
accumulator.above_threshold_count + 1.0
|
||||
else
|
||||
accumulator.above_threshold_count + 0.0,
|
||||
}),
|
||||
),
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
|
@ -0,0 +1,69 @@
|
|||
---
|
||||
title: Query cumulative sum
|
||||
seotitle: Query cumulative sum in Flux
|
||||
list_title: Cumulative sum
|
||||
description: >
|
||||
Use the `cumulativeSum()` function to calculate a running total of values.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Cumulative sum
|
||||
list_query_example: cumulative_sum
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/cumulativesum/
|
||||
v2: /influxdb/v2.0/query-data/flux/cumulativesum/
|
||||
---
|
||||
|
||||
Use the [`cumulativeSum()` function](/{{< latest "flux" >}}/stdlib/universe/cumulativesum/)
|
||||
to calculate a running total of values.
|
||||
`cumulativeSum` sums the values of subsequent records and returns each row updated with the summed total.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content "half" %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
| ----- |:------:|
|
||||
| 0001 | 1 |
|
||||
| 0002 | 2 |
|
||||
| 0003 | 1 |
|
||||
| 0004 | 3 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content "half" %}}
|
||||
**`cumulativeSum()` returns:**
|
||||
|
||||
| _time | _value |
|
||||
| ----- |:------:|
|
||||
| 0001 | 1 |
|
||||
| 0002 | 3 |
|
||||
| 0003 | 4 |
|
||||
| 0004 | 7 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
##### Calculate the running total of values
|
||||
```js
|
||||
data
|
||||
|> cumulativeSum()
|
||||
```
|
||||
|
||||
## Use cumulativeSum() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
`aggregateWindow()` expects an aggregate function that returns a single row for each time window.
|
||||
To use `cumulativeSum()` with `aggregateWindow()`, pass `sum` as the `fn` parameter of `aggregateWindow()`,
|
||||
then calculate the running total of the aggregate values with `cumulativeSum()`.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(every: 5m, fn: sum)
|
||||
|> cumulativeSum()
|
||||
```
|
|
@ -0,0 +1,84 @@
|
|||
---
|
||||
title: Check if a value exists
|
||||
seotitle: Use Flux to check if a value exists
|
||||
list_title: Exists
|
||||
description: >
|
||||
Use the Flux `exists` operator to check if a record contains a key or if that
|
||||
key's value is `null`.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Exists
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/exists/
|
||||
v2: /influxdb/v2.0/query-data/flux/exists/
|
||||
list_code_example: |
|
||||
##### Filter null values
|
||||
```js
|
||||
data
|
||||
|> filter(fn: (r) => exists r._value)
|
||||
```
|
||||
---
|
||||
|
||||
Use the Flux `exists` operator to check if a record contains a key or if that
|
||||
key's value is `null`.
|
||||
|
||||
```js
|
||||
p = {firstName: "John", lastName: "Doe", age: 42}
|
||||
|
||||
exists p.firstName
|
||||
// Returns true
|
||||
|
||||
exists p.height
|
||||
// Returns false
|
||||
```
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
Use `exists` with row functions (
|
||||
[`filter()`](/{{< latest "flux" >}}/stdlib/universe/filter/),
|
||||
[`map()`](/{{< latest "flux" >}}/stdlib/universe/map/),
|
||||
[`reduce()`](/{{< latest "flux" >}}/stdlib/universe/reduce/))
|
||||
to check if a row includes a column or if the value for that column is `null`.
|
||||
|
||||
#### Filter null values
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => exists r._value)
|
||||
```
|
||||
|
||||
#### Map values based on existence
|
||||
```js
|
||||
from(bucket: "default")
|
||||
|> range(start: -30s)
|
||||
|> map(
|
||||
fn: (r) => ({r with
|
||||
human_readable: if exists r._value then
|
||||
"${r._field} is ${string(v: r._value)}."
|
||||
else
|
||||
"${r._field} has no value.",
|
||||
}),
|
||||
)
|
||||
```
|
||||
|
||||
#### Ignore null values in a custom aggregate function
|
||||
```js
|
||||
customSumProduct = (tables=<-) => tables
|
||||
|> reduce(
|
||||
identity: {sum: 0.0, product: 1.0},
|
||||
fn: (r, accumulator) => ({r with
|
||||
sum: if exists r._value then
|
||||
r._value + accumulator.sum
|
||||
else
|
||||
accumulator.sum,
|
||||
product: if exists r._value then
|
||||
r._value * accumulator.product
|
||||
else
|
||||
accumulator.product,
|
||||
}),
|
||||
)
|
||||
```
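Once defined, the custom function can be piped forward like any other aggregate. A usage sketch, assuming the `data` variable convention used throughout these guides:

```js
data
    |> customSumProduct()
```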
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
title: Fill null values in data
|
||||
seotitle: Fill null values in data
|
||||
list_title: Fill
|
||||
description: >
|
||||
Use the `fill()` function to replace _null_ values.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Fill
|
||||
list_query_example: fill_null
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/fill/
|
||||
v2: /influxdb/v2.0/query-data/flux/fill/
|
||||
---
|
||||
|
||||
Use the [`fill()` function](/{{< latest "flux" >}}/stdlib/universe/fill/)
|
||||
to replace _null_ values with:
|
||||
|
||||
- [the previous non-null value](#fill-with-the-previous-value)
|
||||
- [a specified value](#fill-with-a-specified-value)
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> fill(usePrevious: true)
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> fill(value: 0.0)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Fill empty windows of time
|
||||
The `fill()` function **does not** fill empty windows of time.
|
||||
It only replaces _null_ values in existing data.
|
||||
Filling empty windows of time requires time interpolation
|
||||
_(see [influxdata/flux#2428](https://github.com/influxdata/flux/issues/2428))_.
|
||||
{{% /note %}}
|
||||
|
||||
## Fill with the previous value
|
||||
To fill _null_ values with the previous **non-null** value, set the `usePrevious` parameter to `true`.
|
||||
|
||||
{{% note %}}
|
||||
Values remain _null_ if there is no previous non-null value in the table.
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
data
|
||||
|> fill(usePrevious: true)
|
||||
```
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | null |
|
||||
| 2020-01-01T00:04:00Z | null |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`fill(usePrevious: true)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | 0.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.8 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
## Fill with a specified value
|
||||
To fill _null_ values with a specified value, use the `value` parameter to specify the fill value.
|
||||
_The fill value must match the [data type](/{{< latest "flux" >}}/language/types/#basic-types)
|
||||
of the [column](/{{< latest "flux" >}}/stdlib/universe/fill/#column)._
|
||||
|
||||
```js
|
||||
data
|
||||
|> fill(value: 0.0)
|
||||
```
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | null |
|
||||
| 2020-01-01T00:04:00Z | null |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`fill(value: 0.0)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 0.0 |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | 0.0 |
|
||||
| 2020-01-01T00:04:00Z | 0.0 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,149 @@
|
|||
---
|
||||
title: Query first and last values
|
||||
seotitle: Query first and last values in Flux
|
||||
list_title: First and last
|
||||
description: >
|
||||
Use the `first()` or `last()` functions to return the first or last point in an input table.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: First & last
|
||||
list_query_example: first_last
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/first-last/
|
||||
v2: /influxdb/v2.0/query-data/flux/first-last/
|
||||
---
|
||||
|
||||
Use the [`first()`](/{{< latest "flux" >}}/stdlib/universe/first/) or
|
||||
[`last()`](/{{< latest "flux" >}}/stdlib/universe/last/) functions
|
||||
to return the first or last record in an input table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> first()
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> last()
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
By default, InfluxDB returns results sorted by time. However, you can use the
|
||||
[`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort/)
|
||||
to change how results are sorted.
|
||||
`first()` and `last()` respect the sort order of input data and return records
|
||||
based on the order they are received in.
|
||||
{{% /note %}}
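For example, to return the record with the highest value rather than the most recent one, you could sort by `_value` before selecting the last record (a quick sketch using the `data` variable from above):

```js
data
    |> sort(columns: ["_value"])
    |> last()
```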
|
||||
|
||||
### first
|
||||
`first()` returns the first non-null record in an input table.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
```js
|
||||
|> first()
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### last
|
||||
`last()` returns the last non-null record in an input table.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
|
||||
```js
|
||||
|> last()
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
## Use first() or last() with aggregateWindow()
|
||||
Use `first()` and `last()` with [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
|
||||
to select the first or last records in time-based groups.
|
||||
`aggregateWindow()` segments data into windows of time, aggregates data in each window into a single
|
||||
point using aggregate or selector functions, and then removes the time-based segmentation.
|
||||
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 10 |
|
||||
| 2020-01-01T00:00:15Z | 12 |
|
||||
| 2020-01-01T00:00:45Z | 9 |
|
||||
| 2020-01-01T00:01:05Z | 9 |
|
||||
| 2020-01-01T00:01:10Z | 15 |
|
||||
| 2020-01-01T00:02:30Z | 11 |
|
||||
{{% /flex-content %}}
|
||||
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[first](#)
|
||||
[last](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
|> aggregateWindow(
|
||||
every: 1h,
|
||||
fn: first,
|
||||
)
|
||||
```
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:59Z | 10 |
|
||||
| 2020-01-01T00:01:59Z | 9 |
|
||||
| 2020-01-01T00:02:59Z | 11 |
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
|> aggregateWindow(
|
||||
every: 1h,
|
||||
fn: last,
|
||||
)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:59Z | 9 |
|
||||
| 2020-01-01T00:01:59Z | 15 |
|
||||
| 2020-01-01T00:02:59Z | 11 |
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,137 @@
|
|||
---
|
||||
title: Use Flux in Chronograf dashboards
|
||||
description: >
|
||||
This guide walks through using Flux queries in Chronograf dashboard cells,
|
||||
what template variables are available, and how to use them.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Use Flux in dashboards
|
||||
parent: Query with Flux
|
||||
weight: 30
|
||||
canonical: /{{< latest "influxdb" "v1" >}}/flux/guides/flux-in-dashboards/
|
||||
---
|
||||
|
||||
[Chronograf](/{{< latest "chronograf" >}}/) is the web user interface for managing for the
|
||||
InfluxData platform. It lets you create and customize dashboards that visualize your data.
|
||||
Visualized data is retrieved using either an InfluxQL or Flux query.
|
||||
This guide walks through using Flux queries in Chronograf dashboard cells.
|
||||
|
||||
## Using Flux in dashboard cells
|
||||
|
||||
---
|
||||
|
||||
_**Chronograf v1.8+** and **InfluxDB v1.8 with [Flux enabled](/enterprise_influxdb/v1.10/flux/installation)**
|
||||
are required to use Flux in dashboards._
|
||||
|
||||
---
|
||||
|
||||
To use Flux in a dashboard cell, either create a new cell or edit an existing cell
|
||||
by clicking the **pencil** icon in the top right corner of the cell.
|
||||
To the right of the **Source dropdown** above the graph preview, select **Flux** as the source type.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-cell.png" alt="Flux in Chronograf dashboard cells" />}}
|
||||
|
||||
> The Flux source type is only available if your data source has
|
||||
> [Flux enabled](/enterprise_influxdb/v1.10/flux/installation).
|
||||
|
||||
This will provide **Schema**, **Script**, and **Functions** panes.
|
||||
|
||||
### Schema pane
|
||||
The Schema pane allows you to explore your data and add filters for specific
|
||||
measurements, fields, and tags to your Flux script.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-add-filter.png" title="Add a filter from the Schema panel" />}}
|
||||
|
||||
### Script pane
|
||||
The Script pane is where you write your Flux script.
|
||||
In its default state, the **Script** pane includes an optional [Script Wizard](/chronograf/v1.8/guides/querying-data/#explore-data-with-flux)
|
||||
that uses selected options to build a Flux query for you.
|
||||
The generated query includes all the relevant functions and [template variables](#template-variables-in-flux)
|
||||
required to return your desired data.
|
||||
|
||||
### Functions pane
|
||||
The Functions pane provides a list of functions available in your Flux queries.
|
||||
Clicking on a function will add it to the end of the script in the Script pane.
|
||||
Hovering over a function provides documentation for the function as well as links
|
||||
to more detailed documentation.
|
||||
|
||||
### Dynamic sources
|
||||
Chronograf can be configured with multiple data sources.
|
||||
The **Sources dropdown** allows you to select a specific data source to connect to,
|
||||
but a **Dynamic Source** option is also available.
|
||||
With a dynamic source, the cell will query data from whatever data source to which
|
||||
Chronograf is currently connected.
|
||||
Connections are managed under Chronograf's **Configuration** tab.
|
||||
|
||||
### View raw data
|
||||
As you're building your Flux scripts, each function processes or transforms your
|
||||
data in ways specific to the function.
|
||||
It can be helpful to view the actual data in order to see how it is being shaped.
|
||||
The **View Raw Data** toggle above the data visualization switches between graphed
|
||||
data and raw data shown in table form.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-view-raw.png" alt="View raw data" />}}
|
||||
|
||||
_The **View Raw Data** toggle is only available when using Flux._
|
||||
|
||||
## Template variables in Flux
|
||||
Chronograf [template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
|
||||
allow you to alter specific components of cells’ queries using elements provided in the
|
||||
Chronograf user interface.
|
||||
|
||||
In your Flux query, reference template variables just as you would reference defined Flux variables.
|
||||
The following example uses Chronograf's [predefined template variables](#predefined-template-variables),
|
||||
`dashboardTime`, `upperDashboardTime`, and `autoInterval`:
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> filter(fn: (r) => r._measurement == "cpu")
|
||||
|> range(start: dashboardTime, stop: upperDashboardTime)
|
||||
|> window(every: autoInterval)
|
||||
```
|
||||
|
||||
### Predefined template variables
|
||||
|
||||
#### dashboardTime
|
||||
The `dashboardTime` template variable represents the lower time bound of ranged data.
|
||||
Its value is controlled by the time dropdown in your dashboard.
|
||||
It should be used to define the `start` parameter of the `range()` function.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(start: dashboardTime)
|
||||
```
|
||||
|
||||
#### upperDashboardTime
|
||||
The `upperDashboardTime` template variable represents the upper time bound of ranged data.
|
||||
Its value is modified by the time dropdown in your dashboard when using an absolute time range.
|
||||
It should be used to define the `stop` parameter of the `range()` function.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(start: dashboardTime, stop: upperDashboardTime)
|
||||
```
|
||||
> As a best practice, always set the `stop` parameter of the `range()` function to `upperDashboardTime` in cell queries.
|
||||
> Without it, `stop` defaults to "now" and the absolute upper range bound selected in the time dropdown is not honored,
|
||||
> potentially causing unnecessary load on InfluxDB.
|
||||
|
||||
#### autoInterval
|
||||
The `autoInterval` template variable represents the refresh interval of the dashboard
|
||||
and is controlled by the refresh interval dropdown.
|
||||
It's typically used to align window intervals created in
|
||||
[windowing and aggregation](/enterprise_influxdb/v1.10/flux/guides/window-aggregate) operations with dashboard refreshes.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(start: dashboardTime, stop: upperDashboardTime)
|
||||
|> aggregateWindow(every: autoInterval, fn: mean)
|
||||
```
|
||||
|
||||
### Custom template variables
|
||||
{{% warn %}}
|
||||
Chronograf does not support the use of custom template variables in Flux queries.
|
||||
{{% /warn %}}
|
||||
|
||||
## Using Flux and InfluxQL
|
||||
Within individual dashboard cells, the use of Flux and InfluxQL is mutually exclusive.
|
||||
However, a dashboard may consist of different cells, each using Flux or InfluxQL.
|
|
@ -0,0 +1,91 @@
|
|||
---
|
||||
title: Work with geo-temporal data
|
||||
list_title: Geo-temporal data
|
||||
description: >
|
||||
Use the Flux Geo package to filter geo-temporal data and group by geographic location or track.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Geo-temporal data
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo) to
|
||||
filter geo-temporal data and group by geographic location or track.
|
||||
|
||||
{{% warn %}}
|
||||
The Geo package is experimental and subject to change at any time.
|
||||
By using it, you agree to the [risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
|
||||
{{% /warn %}}
|
||||
|
||||
**To work with geo-temporal data:**
|
||||
|
||||
1. Import the `experimental/geo` package.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
```
|
||||
|
||||
2. Load geo-temporal data. _See below for [sample geo-temporal data](#sample-data)._
|
||||
3. Do one or more of the following:
|
||||
|
||||
- [Shape data to work with the Geo package](#shape-data-to-work-with-the-geo-package)
|
||||
- [Filter data by region](#filter-geo-temporal-data-by-region) (using strict or non-strict filters)
|
||||
- [Group data by area or by track](#group-geo-temporal-data)
|
||||
|
||||
{{< children >}}
|
||||
|
||||
---
|
||||
|
||||
## Sample data
|
||||
Many of the examples in this section use a `sampleGeoData` variable that represents
|
||||
a sample set of geo-temporal data.
|
||||
The [Bird Migration Sample Data](https://github.com/influxdata/influxdb2-sample-data/tree/master/bird-migration-data)
|
||||
available on GitHub provides sample geo-temporal data that meets the
|
||||
[requirements of the Flux Geo package](/{{< latest "flux" >}}/stdlib/experimental/geo/#geo-schema-requirements).
|
||||
|
||||
### Load annotated CSV sample data
|
||||
Use the [experimental `csv.from()` function](/{{< latest "flux" >}}/stdlib/experimental/csv/from/)
|
||||
to load the sample bird migration annotated CSV data from GitHub:
|
||||
|
||||
```js
|
||||
import "experimental/csv"
|
||||
|
||||
sampleGeoData = csv.from(
|
||||
url: "https://github.com/influxdata/influxdb2-sample-data/blob/master/bird-migration-data/bird-migration.csv"
|
||||
)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
`csv.from(url: ...)` downloads sample data each time you execute the query **(~1.3 MB)**.
|
||||
If bandwidth is a concern, use the [`to()` function](/{{< latest "flux" >}}/stdlib/built-in/outputs/to/)
|
||||
to write the data to a bucket, and then query the bucket with [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from/).
|
||||
{{% /note %}}
|
||||
|
||||
### Write sample data to InfluxDB with line protocol
|
||||
Use `curl` and the `influx write` command to write bird migration line protocol to InfluxDB.
|
||||
Replace `db/rp` with your destination bucket:
|
||||
|
||||
```sh
|
||||
curl https://raw.githubusercontent.com/influxdata/influxdb2-sample-data/master/bird-migration-data/bird-migration.line --output ./tmp-data
|
||||
influx write -b db/rp @./tmp-data
|
||||
rm -f ./tmp-data
|
||||
```
|
||||
|
||||
Use Flux to query the bird migration data and assign it to the `sampleGeoData` variable:
|
||||
|
||||
```js
|
||||
sampleGeoData = from(bucket: "db/rp")
|
||||
|> range(start: 2019-01-01T00:00:00Z, stop: 2019-12-31T23:59:59Z)
|
||||
|> filter(fn: (r) => r._measurement == "migration")
|
||||
```
|
|
@ -0,0 +1,129 @@
|
|||
---
|
||||
title: Filter geo-temporal data by region
|
||||
description: >
|
||||
Use the `geo.filterRows` function to filter geo-temporal data by box-shaped, circular, or polygonal geographic regions.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Filter by region
|
||||
parent: Geo-temporal data
|
||||
weight: 302
|
||||
related:
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/filterrows/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/filter-by-region/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/filter-by-region/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(
|
||||
region: {lat: 30.04, lon: 31.23, radius: 200.0},
|
||||
strict: true
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [`geo.filterRows` function](/{{< latest "flux" >}}/stdlib/experimental/geo/filterrows/)
|
||||
to filter geo-temporal data by geographic region:
|
||||
|
||||
1. [Define a geographic region](#define-a-geographic-region)
|
||||
2. [Use strict or non-strict filtering](#strict-and-non-strict-filtering)
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
|
||||
and queries data points **within 200km of Cairo, Egypt**:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0}, strict: true)
|
||||
```
|
||||
|
||||
## Define a geographic region
|
||||
Many functions in the Geo package filter data based on geographic region.
|
||||
Define a geographic region using one of the following shapes:
|
||||
|
||||
- [box](#box)
|
||||
- [circle](#circle)
|
||||
- [polygon](#polygon)
|
||||
|
||||
### box
|
||||
Define a box-shaped region by specifying a record containing the following properties:
|
||||
|
||||
- **minLat:** minimum latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **maxLat:** maximum latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **minLon:** minimum longitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **maxLon:** maximum longitude in decimal degrees (WGS 84) _(Float)_
|
||||
|
||||
##### Example box-shaped region
|
||||
```js
|
||||
{
|
||||
minLat: 40.51757813,
|
||||
maxLat: 40.86914063,
|
||||
minLon: -73.65234375,
|
||||
maxLon: -72.94921875,
|
||||
}
|
||||
```
|
||||
|
||||
### circle
|
||||
Define a circular region by specifying a record containing the following properties:
|
||||
|
||||
- **lat**: latitude of the circle center in decimal degrees (WGS 84) _(Float)_
|
||||
- **lon**: longitude of the circle center in decimal degrees (WGS 84) _(Float)_
|
||||
- **radius**: radius of the circle in kilometers (km) _(Float)_
|
||||
|
||||
##### Example circular region
|
||||
```js
|
||||
{
|
||||
lat: 40.69335938,
|
||||
lon: -73.30078125,
|
||||
radius: 20.0,
|
||||
}
|
||||
```
|
||||
|
||||
### polygon
|
||||
Define a polygonal region with a record containing the latitude and longitude for
|
||||
each point in the polygon:
|
||||
|
||||
- **points**: points that define the custom polygon _(Array of records)_
|
||||
|
||||
Define each point with a record containing the following properties:
|
||||
|
||||
- **lat**: latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **lon**: longitude in decimal degrees (WGS 84) _(Float)_
|
||||
|
||||
##### Example polygonal region
|
||||
```js
|
||||
{
|
||||
points: [
|
||||
{lat: 40.671659, lon: -73.936631},
|
||||
{lat: 40.706543, lon: -73.749177},
|
||||
{lat: 40.791333, lon: -73.880327},
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Strict and non-strict filtering
|
||||
In most cases, the specified geographic region does not perfectly align with S2 grid cells.
|
||||
|
||||
- **Non-strict filtering** returns points that may be outside of the specified region but
|
||||
inside S2 grid cells partially covered by the region.
|
||||
- **Strict filtering** returns only points inside the specified region.
|
||||
|
||||
_Strict filtering is less performant, but more accurate than non-strict filtering._
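For example, to opt into the faster, non-strict behavior explicitly, set the `strict` parameter to `false` (a sketch that mirrors the circular-region query above):

```js
import "experimental/geo"

sampleGeoData
    |> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0}, strict: false)
```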
|
||||
|
||||
<span class="key-geo-cell"></span> S2 grid cell
|
||||
<span class="key-geo-region"></span> Filter region
|
||||
<span class="key-geo-point"></span> Returned point
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Strict filtering**
|
||||
{{< svg "/static/svgs/geo-strict.svg" >}}
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**Non-strict filtering**
|
||||
{{< svg "/static/svgs/geo-non-strict.svg" >}}
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
title: Group geo-temporal data
|
||||
description: >
|
||||
Use the `geo.groupByArea()` to group geo-temporal data by area and `geo.asTracks()`
|
||||
to group data into tracks or routes.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Geo-temporal data
|
||||
weight: 302
|
||||
related:
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/groupbyarea/
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/astracks/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/group-geo-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/group-geo-data/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|
||||
|> geo.asTracks(groupBy: ["id"],sortBy: ["_time"])
|
||||
```
|
||||
---
|
||||
|
||||
Use the `geo.groupByArea()` function to group geo-temporal data by area and the `geo.asTracks()` function
|
||||
to group data into tracks or routes.
|
||||
|
||||
- [Group data by area](#group-data-by-area)
|
||||
- [Group data into tracks or routes](#group-data-by-track-or-route)
|
||||
|
||||
### Group data by area
|
||||
Use the [`geo.groupByArea()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/groupbyarea/)
|
||||
to group geo-temporal data points by geographic area.
|
||||
Areas are determined by [S2 grid cells](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering).
|
||||
|
||||
- Specify a new column to store the unique area identifier for each point with the `newColumn` parameter.
|
||||
- Specify the [S2 cell level](https://s2geometry.io/resources/s2cell_statistics)
|
||||
to use when calculating geographic areas with the `level` parameter.
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
|
||||
to query data points within 200km of Cairo, Egypt and group them by geographic area:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|
||||
```
|
||||
|
||||
### Group data by track or route
|
||||
Use the [`geo.asTracks()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/astracks/)
|
||||
to group data points into tracks or routes and order them by time or other columns.
|
||||
Data must contain a unique identifier for each track. For example: `id` or `tid`.
|
||||
|
||||
- Specify columns that uniquely identify each track or route with the `groupBy` parameter.
|
||||
- Specify which columns to sort by with the `sortBy` parameter. Default is `["_time"]`.
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.10/flux/guides/geo/#sample-data)
|
||||
to query data points within 200km of Cairo, Egypt and group them into routes unique
|
||||
to each bird:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.asTracks(groupBy: ["id"], sortBy: ["_time"])
|
||||
```
|
|
@ -0,0 +1,116 @@
|
|||
---
|
||||
title: Shape data to work with the Geo package
|
||||
description: >
|
||||
Functions in the Flux Geo package require **lat** and **lon** fields and an **s2_cell_id** tag.
|
||||
Rename latitude and longitude fields and generate S2 cell ID tokens.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Shape geo-temporal data
|
||||
parent: Geo-temporal data
|
||||
weight: 301
|
||||
related:
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/
|
||||
- /{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/shape-geo-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/shape-geo-data/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 10)
|
||||
```
|
||||
---
|
||||
|
||||
Functions in the Geo package require the following data schema:
|
||||
|
||||
- an **s2_cell_id** tag containing the [S2 Cell ID](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering)
|
||||
**as a token**
|
||||
- a **`lat` field** containing the **latitude in decimal degrees** (WGS 84)
|
||||
- a **`lon` field** containing the **longitude in decimal degrees** (WGS 84)
|
||||
|
||||
## Shape geo-temporal data
|
||||
If your data already contains latitude and longitude fields, use the
|
||||
[`geo.shapeData()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/)
|
||||
to rename the fields to match the requirements of the Geo package, pivot the data
|
||||
into row-wise sets, and generate S2 cell ID tokens for each point.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "example-bucket")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 10)
|
||||
```
|
||||
|
||||
## Generate S2 cell ID tokens
|
||||
The Geo package uses the [S2 Geometry Library](https://s2geometry.io/) to represent
|
||||
geographic coordinates on a three-dimensional sphere.
|
||||
The sphere is divided into [cells](https://s2geometry.io/devguide/s2cell_hierarchy),
|
||||
each with a unique 64-bit identifier (S2 cell ID).
|
||||
Grid and S2 cell ID accuracy are defined by a [level](https://s2geometry.io/resources/s2cell_statistics).
|
||||
|
||||
{{% note %}}
|
||||
To filter more quickly, use higher S2 Cell ID levels,
|
||||
but know that higher levels increase [series cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality).
|
||||
{{% /note %}}
|
||||
|
||||
The Geo package requires S2 cell IDs as tokens.
|
||||
To generate and add S2 cell ID tokens to your data, use one of the following options:
|
||||
|
||||
- [Generate S2 cell ID tokens with Telegraf](#generate-s2-cell-id-tokens-with-telegraf)
|
||||
- [Generate S2 cell ID tokens with language-specific libraries](#generate-s2-cell-id-tokens-with-language-specific-libraries)
|
||||
- [Generate S2 cell ID tokens with Flux](#generate-s2-cell-id-tokens-with-flux)
|
||||
|
||||
### Generate S2 cell ID tokens with Telegraf
|
||||
Enable the [Telegraf S2 Geo (`s2geo`) processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/s2geo)
|
||||
to generate S2 cell ID tokens at a specified `cell_level` using `lat` and `lon` field values.
|
||||
|
||||
Add the `processors.s2geo` configuration to your Telegraf configuration file (`telegraf.conf`):
|
||||
|
||||
```toml
|
||||
[[processors.s2geo]]
|
||||
## The name of the lat and lon fields containing WGS-84 latitude and
|
||||
## longitude in decimal degrees.
|
||||
lat_field = "lat"
|
||||
lon_field = "lon"
|
||||
|
||||
## New tag to create
|
||||
tag_key = "s2_cell_id"
|
||||
|
||||
## Cell level (see https://s2geometry.io/resources/s2cell_statistics.html)
|
||||
cell_level = 9
|
||||
```
|
||||
|
||||
Telegraf stores the S2 cell ID token in the `s2_cell_id` tag.
|
||||
|
||||
### Generate S2 cell ID tokens with language-specific libraries
|
||||
Many programming languages offer S2 Libraries with methods for generating S2 cell ID tokens.
|
||||
Use latitude and longitude with the `s2.CellID.ToToken` endpoint of the S2 Geometry
|
||||
Library to generate `s2_cell_id` tags. For example:
|
||||
|
||||
- **Go:** [s2.CellID.ToToken()](https://godoc.org/github.com/golang/geo/s2#CellID.ToToken)
|
||||
- **Python:** [s2sphere.CellId.to_token()](https://s2sphere.readthedocs.io/en/latest/api.html#s2sphere.CellId)
|
||||
- **JavaScript:** [s2.cellid.toToken()](https://github.com/mapbox/node-s2/blob/master/API.md#cellidtotoken---string)
|
||||
|
||||
### Generate S2 cell ID tokens with Flux
|
||||
Use the [`geo.s2CellIDToken()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/s2cellidtoken/)
|
||||
with existing longitude (`lon`) and latitude (`lat`) field values to generate and add the S2 cell ID token.
|
||||
First, use the [`geo.toRows()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/torows/)
|
||||
to pivot **lat** and **lon** fields into row-wise sets:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "example-bucket")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> geo.toRows()
|
||||
|> map(fn: (r) => ({r with s2_cell_id: geo.s2CellIDToken(point: {lon: r.lon, lat: r.lat}, level: 10)}))
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The [`geo.shapeData()` function](/{{< latest "flux" >}}/stdlib/experimental/geo/shapedata/)
|
||||
generates S2 cell ID tokens as well.
|
||||
{{% /note %}}
|
|
@ -0,0 +1,673 @@
|
|||
---
|
||||
title: Group data in InfluxDB with Flux
|
||||
list_title: Group
|
||||
description: >
|
||||
Use the `group()` function to group data with common values in specific columns.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Group
|
||||
parent: Query with Flux
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/guides/grouping-data/
|
||||
list_query_example: group
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/group-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/group-data/
|
||||
---
|
||||
|
||||
With Flux, you can group data by any column in your queried data set.
|
||||
"Grouping" partitions data into tables in which each row shares a common value for specified columns.
|
||||
This guide walks through grouping data in Flux and provides examples of how data is shaped in the process.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Group keys
|
||||
Every table has a **group key** – a list of columns for which every row in the table has the same value.
|
||||
|
||||
###### Example group key
|
||||
```js
|
||||
[_start, _stop, _field, _measurement, host]
|
||||
```
|
||||
|
||||
Grouping data in Flux is essentially defining the group key of output tables.
|
||||
Understanding how modifying group keys shapes output data is key to successfully
|
||||
grouping and transforming data into your desired output.
|
||||
|
||||
## group() function
|
||||
Flux's [`group()` function](/{{< latest "flux" >}}/stdlib/universe/group) defines the
|
||||
group key for output tables, i.e. grouping records based on values for specific columns.
|
||||
|
||||
###### group() example
|
||||
```js
|
||||
dataStream
|
||||
|> group(columns: ["cpu", "host"])
|
||||
```
|
||||
|
||||
###### Resulting group key
|
||||
```js
|
||||
[cpu, host]
|
||||
```
|
||||
|
||||
The `group()` function has the following parameters:
|
||||
|
||||
### columns
|
||||
The list of columns to include or exclude (depending on the [mode](#mode)) in the grouping operation.
|
||||
|
||||
### mode
|
||||
The method used to define the group and resulting group key.
|
||||
Possible values include `by` and `except`.
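
For example, the following sketch (using the hypothetical `dataStream` from above) groups by every column except `_time` and `_value`, which keeps all remaining columns, including tags, in the group key:

```js
dataStream
    |> group(columns: ["_time", "_value"], mode: "except")
```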
|
||||
|
||||
|
||||
## Example grouping operations
|
||||
To illustrate how grouping works, define a `dataSet` variable that queries System
|
||||
CPU usage from the `db/rp` bucket.
|
||||
Filter the `cpu` tag so it only returns results for each numbered CPU core.
|
||||
|
||||
### Data set
|
||||
CPU used by system operations for all numbered CPU cores.
|
||||
It uses a regular expression to filter only numbered cores.
|
||||
|
||||
```js
|
||||
dataSet = from(bucket: "db/rp")
|
||||
|> range(start: -2m)
|
||||
    |> filter(fn: (r) => r._field == "usage_system" and r.cpu =~ /cpu[0-9]+/)
|
||||
|> drop(columns: ["host"])
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
This example drops the `host` column from the returned data since the CPU data
|
||||
is only tracked for a single host and it simplifies the output tables.
|
||||
Don't drop the `host` column if monitoring multiple hosts.
|
||||
{{% /note %}}
|
||||
|
||||
{{% truncate %}}
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:00.000000000Z 7.892107892107892
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:10.000000000Z 7.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:20.000000000Z 7.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:30.000000000Z 5.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:40.000000000Z 7.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:50.000000000Z 7.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:00.000000000Z 10.3
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:10.000000000Z 9.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:20.000000000Z 8.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:30.000000000Z 8.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:40.000000000Z 8.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:50.000000000Z 10.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:36:00.000000000Z 10.6
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:00.000000000Z 0.7992007992007992
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:10.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:20.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:30.000000000Z 0.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:40.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:50.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:00.000000000Z 1.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:10.000000000Z 1.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:20.000000000Z 0.8
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:30.000000000Z 0.8991008991008991
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:40.000000000Z 0.8008008008008008
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:50.000000000Z 0.999000999000999
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:36:00.000000000Z 1.1022044088176353
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:00.000000000Z 4.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:10.000000000Z 3.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:20.000000000Z 3.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:30.000000000Z 2.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:40.000000000Z 4.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:50.000000000Z 4.895104895104895
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:00.000000000Z 6.906906906906907
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:10.000000000Z 5.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:20.000000000Z 5.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:30.000000000Z 4.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:40.000000000Z 5.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:50.000000000Z 5.9
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:36:00.000000000Z 6.4935064935064934
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:00.000000000Z 0.5005005005005005
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:10.000000000Z 0.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:20.000000000Z 0.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:30.000000000Z 0.3
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:40.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:50.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:00.000000000Z 1.3986013986013985
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:10.000000000Z 0.9
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:20.000000000Z 0.5005005005005005
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:30.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:40.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:50.000000000Z 0.8
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:36:00.000000000Z 0.9
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
**Note that the group key is output with each table: `Table: keys: <group-key>`.**
|
||||
|
||||
![Group example data set](/img/flux/grouping-data-set.png)
|
||||
|
||||
### Group by CPU
|
||||
Group the `dataSet` stream by the `cpu` column.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["cpu"])
|
||||
```
|
||||
|
||||
This won't actually change the structure of the data since it already has `cpu`
|
||||
in the group key and is therefore grouped by `cpu`.
|
||||
However, notice that it does change the group key:
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by CPU output tables
|
||||
```
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 7.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 10.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 9.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 8.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 8.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 8.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 10.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 1.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.8991008991008991 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.8008008008008008 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.999000999000999 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 2.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 4.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 4.895104895104895 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 6.906906906906907 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 5.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 4.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 5.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.3986013986013985 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
The visualization remains the same.
|
||||
|
||||
![Group by CPU](/img/flux/grouping-data-set.png)
|
||||
|
||||
### Group by time
|
||||
Grouping data by the `_time` column is a good illustration of how grouping changes the structure of your data.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["_time"])
|
||||
```
|
||||
|
||||
When grouping by `_time`, all records that share a common `_time` value are grouped into individual tables.
|
||||
So each output table represents a single point in time.
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by time output tables
|
||||
```
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu cpu0
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu cpu1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu cpu2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu cpu0
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu cpu2
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu cpu2
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu cpu0
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.4 usage_system cpu cpu1
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 2.6 usage_system cpu cpu2
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.3 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.5 usage_system cpu cpu2
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.5 usage_system cpu cpu0
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.895104895104895 usage_system cpu cpu2
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.3 usage_system cpu cpu0
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.4 usage_system cpu cpu1
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.906906906906907 usage_system cpu cpu2
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.3986013986013985 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 9.2 usage_system cpu cpu0
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.2 usage_system cpu cpu1
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.7 usage_system cpu cpu2
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.4 usage_system cpu cpu0
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu1
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.5 usage_system cpu cpu0
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8991008991008991 usage_system cpu cpu1
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.7 usage_system cpu cpu2
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.6 usage_system cpu cpu0
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8008008008008008 usage_system cpu cpu1
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.2 usage_system cpu cpu0
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.999000999000999 usage_system cpu cpu1
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.9 usage_system cpu cpu2
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu cpu0
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu cpu1
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu cpu2
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
Because each timestamp is structured as a separate table, when visualized, all
|
||||
points that share the same timestamp appear connected.
|
||||
|
||||
![Group by time](/img/flux/grouping-by-time.png)
|
||||
|
||||
{{% note %}}
|
||||
With some further processing, you could calculate the average CPU usage across all CPUs per point
|
||||
of time and group them into a single table, but we won't cover that in this example.
|
||||
If you're interested in running and visualizing this yourself, here's what the query would look like:
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["_time"])
|
||||
|> mean()
|
||||
|> group(columns: ["_value", "_time"], mode: "except")
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
### Group by CPU and time
|
||||
Group by the `cpu` and `_time` columns.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["cpu", "_time"])
|
||||
```
|
||||
|
||||
This outputs a table for every unique `cpu` and `_time` combination:
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by CPU and time output tables
|
||||
```
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:30.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:30.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:30.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 2.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:30.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:40.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:40.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:40.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:40.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:50.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:50.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:50.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.895104895104895 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:50.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 6.906906906906907 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 1.3986013986013985 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:10.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 9.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:10.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:10.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:10.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:20.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:20.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:20.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:20.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:30.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:30.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8991008991008991 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:30.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:30.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:40.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 8.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:40.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.8008008008008008 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:40.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:40.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:50.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:50.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.999000999000999 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:50.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 5.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:35:50.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:36:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:36:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:36:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:36:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
When visualized, tables appear as individual, unconnected points.
|
||||
|
||||
![Group by CPU and time](/img/flux/grouping-by-cpu-time.png)
|
||||
|
||||
Grouping by `cpu` and `_time` is a good illustration of how grouping works.
|
||||
|
||||
## In conclusion
|
||||
Grouping is a powerful way to shape your data into your desired output format.
|
||||
It modifies the group keys of output tables, grouping records into tables that
|
||||
all share common values within specified columns.
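
For example, a compact sketch of the pattern used throughout this guide (the bucket name is illustrative) that groups raw `usage_system` values so each output table shares a common `cpu` value:

```js
from(bucket: "telegraf/autogen")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    |> group(columns: ["cpu"])
```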
|
|
@ -0,0 +1,119 @@
|
|||
---
|
||||
title: Create histograms with Flux
|
||||
list_title: Histograms
|
||||
description: >
|
||||
Use the `histogram()` function to create cumulative histograms with Flux.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Histograms
|
||||
parent: Query with Flux
|
||||
weight: 10
|
||||
list_query_example: histogram
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/histograms/
|
||||
v2: /influxdb/v2.0/query-data/flux/histograms/
|
||||
---
|
||||
|
||||
Histograms provide valuable insight into the distribution of your data.
|
||||
This guide walks through using Flux's `histogram()` function to transform your data into a **cumulative histogram**.
|
||||
|
||||
## histogram() function
|
||||
The [`histogram()` function](/{{< latest "flux" >}}/stdlib/universe/histogram) approximates the
|
||||
cumulative distribution of a dataset by counting data frequencies for a list of "bins."
|
||||
A **bin** is simply a range in which a data point falls.
|
||||
All data points less than or equal to a bin's upper bound are counted in that bin.
|
||||
In the histogram output, a `le` column is added that represents the upper bound of each bin.
|
||||
Bin counts are cumulative.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> histogram(bins: [0.0, 10.0, 20.0, 30.0])
|
||||
```
|
||||
|
||||
> Values output by the `histogram` function represent points of data aggregated over time.
|
||||
> Since values do not represent single points in time, there is no `_time` column in the output table.
|
||||
|
||||
## Bin helper functions
|
||||
Flux provides two helper functions for generating histogram bins.
|
||||
Each generates and outputs an array of floats designed to be used in the `histogram()` function's `bins` parameter.
|
||||
|
||||
### linearBins()
|
||||
The [`linearBins()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/linearbins) generates a list of linearly separated floats.
|
||||
|
||||
```js
|
||||
linearBins(start: 0.0, width: 10.0, count: 10)
|
||||
|
||||
// Generated list: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, +Inf]
|
||||
```
|
||||
|
||||
### logarithmicBins()
|
||||
The [`logarithmicBins()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/logarithmicbins) generates a list of exponentially separated floats.
|
||||
|
||||
```js
|
||||
logarithmicBins(start: 1.0, factor: 2.0, count: 10, infinity: true)
|
||||
|
||||
// Generated list: [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, +Inf]
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Generating a histogram with linear bins
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> histogram(bins: linearBins(start: 65.5, width: 0.5, count: 20, infinity: false))
|
||||
```
|
||||
|
||||
###### Output table
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, host]
|
||||
_start:time _stop:time _field:string _measurement:string host:string le:float _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ---------------------------- ----------------------------
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 65.5 5
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 66 6
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 66.5 8
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 67 9
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 67.5 9
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 68 10
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 68.5 12
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 69 12
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 69.5 15
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 70 23
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 70.5 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 71 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 71.5 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 72 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 72.5 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 73 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 73.5 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 74 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 74.5 30
|
||||
2018-11-07T22:19:58.423658000Z 2018-11-07T22:24:58.423658000Z used_percent mem Scotts-MacBook-Pro.local 75 30
|
||||
```
|
||||
|
||||
### Generating a histogram with logarithmic bins
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> histogram(bins: logarithmicBins(start: 0.5, factor: 2.0, count: 10, infinity: false))
|
||||
```
|
||||
|
||||
###### Output table
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, host]
|
||||
_start:time _stop:time _field:string _measurement:string host:string le:float _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ---------------------------- ----------------------------
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 0.5 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 1 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 2 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 4 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 8 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 16 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 32 0
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 64 2
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 128 30
|
||||
2018-11-07T22:23:36.860664000Z 2018-11-07T22:28:36.860664000Z used_percent mem Scotts-MacBook-Pro.local 256 30
|
||||
```
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
title: Calculate the increase
|
||||
seotitle: Calculate the increase in Flux
|
||||
list_title: Increase
|
||||
description: >
|
||||
Use the `increase()` function to track increases across multiple columns in a table.
|
||||
This function is especially useful when tracking changes in counter values that
|
||||
wrap over time or periodically reset.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Increase
|
||||
list_query_example: increase
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/increase/
|
||||
v2: /influxdb/v2.0/query-data/flux/increase/
|
||||
---
|
||||
|
||||
Use the [`increase()` function](/{{< latest "flux" >}}/stdlib/universe/increase/)
|
||||
to track increases across multiple columns in a table.
|
||||
This function is especially useful when tracking changes in counter values that
|
||||
wrap over time or periodically reset.
|
||||
|
||||
```js
|
||||
data
|
||||
|> increase()
|
||||
```
|
||||
|
||||
`increase()` returns a cumulative sum of **non-negative** differences between rows in a table.
|
||||
For example:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1 |
|
||||
| 2020-01-01T00:02:00Z | 2 |
|
||||
| 2020-01-01T00:03:00Z | 8 |
|
||||
| 2020-01-01T00:04:00Z | 10 |
|
||||
| 2020-01-01T00:05:00Z | 0 |
|
||||
| 2020-01-01T00:06:00Z | 4 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`increase()` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1 |
|
||||
| 2020-01-01T00:03:00Z | 7 |
|
||||
| 2020-01-01T00:04:00Z | 9 |
|
||||
| 2020-01-01T00:05:00Z | 9 |
|
||||
| 2020-01-01T00:06:00Z | 13 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
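
For example, a minimal sketch (the bucket name and the `net`/`bytes_recv` counter field are assumptions for illustration) that reports total bytes received over the last hour, even if the underlying counter resets:

```js
from(bucket: "db/rp")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "net" and r._field == "bytes_recv")
    |> increase()
```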
|
|
@ -0,0 +1,285 @@
|
|||
---
|
||||
title: Join data with Flux
|
||||
seotitle: Join data in InfluxDB with Flux
|
||||
list_title: Join
|
||||
description: This guide walks through joining data with Flux and outlines how it shapes your data in the process.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Join
|
||||
parent: Query with Flux
|
||||
weight: 10
|
||||
list_query_example: join
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/join/
|
||||
v2: /influxdb/v2.0/query-data/flux/join/
|
||||
---
|
||||
|
||||
The [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join) merges two or more
|
||||
input streams, whose values are equal on a set of common columns, into a single output stream.
|
||||
Flux allows you to join on any columns common between two data streams and opens the door
|
||||
for operations such as cross-measurement joins and math across measurements.
|
||||
|
||||
To illustrate a join operation, use data captured by Telegraf and stored in
|
||||
InfluxDB: memory usage and processes.
|
||||
|
||||
In this guide, we'll join two data streams, one representing memory usage and the other representing the
|
||||
total number of running processes, then calculate the average memory usage per running process.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Define stream variables
|
||||
In order to perform a join, you must have two streams of data.
|
||||
Assign a variable to each data stream.
|
||||
|
||||
### Memory used variable
|
||||
Define a `memUsed` variable that filters on the `mem` measurement and the `used` field.
|
||||
This returns the amount of memory (in bytes) used.
|
||||
|
||||
###### memUsed stream definition
|
||||
```js
|
||||
memUsed = from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used")
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### memUsed data output
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, host]
|
||||
_start:time _stop:time _field:string _measurement:string host:string _time:time _value:int
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ------------------------------ --------------------------
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:00.000000000Z 10956333056
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:10.000000000Z 11014008832
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:20.000000000Z 11373428736
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:30.000000000Z 11001421824
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:40.000000000Z 10985852928
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:50:50.000000000Z 10992279552
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:00.000000000Z 11053568000
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:10.000000000Z 11092242432
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:20.000000000Z 11612774400
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:30.000000000Z 11131961344
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:40.000000000Z 11124805632
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:51:50.000000000Z 11332464640
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:00.000000000Z 11176923136
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:10.000000000Z 11181068288
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:20.000000000Z 11182579712
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:30.000000000Z 11238862848
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:40.000000000Z 11275296768
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:52:50.000000000Z 11225411584
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:00.000000000Z 11252690944
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:10.000000000Z 11227029504
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:20.000000000Z 11201646592
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:30.000000000Z 11227897856
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:40.000000000Z 11330428928
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:53:50.000000000Z 11347976192
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:00.000000000Z 11368271872
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:10.000000000Z 11269623808
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:20.000000000Z 11295637504
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:30.000000000Z 11354423296
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:40.000000000Z 11379687424
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:54:50.000000000Z 11248926720
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z used mem host1.local 2018-11-06T05:55:00.000000000Z 11292524544
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
### Total processes variable
|
||||
Define a `procTotal` variable that filters on the `processes` measurement and the `total` field.
|
||||
This returns the number of running processes.
|
||||
|
||||
###### procTotal stream definition
|
||||
```js
|
||||
procTotal = from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "processes" and r._field == "total")
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### procTotal data output
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, host]
|
||||
_start:time _stop:time _field:string _measurement:string host:string _time:time _value:int
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------ ------------------------------ --------------------------
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:00.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:10.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:20.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:30.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:40.000000000Z 469
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:50:50.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:00.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:10.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:20.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:30.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:40.000000000Z 469
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:51:50.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:00.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:10.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:20.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:30.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:40.000000000Z 472
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:52:50.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:00.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:10.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:20.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:30.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:40.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:53:50.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:00.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:10.000000000Z 470
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:20.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:30.000000000Z 473
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:40.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:54:50.000000000Z 471
|
||||
2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z total processes host1.local 2018-11-06T05:55:00.000000000Z 471
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## Join the two data streams
|
||||
With the two data streams defined, use the `join()` function to join them together.
|
||||
`join()` requires two parameters:
|
||||
|
||||
##### `tables`
|
||||
A map of tables to join with keys by which they will be aliased.
|
||||
In the example below, `mem` is the alias for `memUsed` and `proc` is the alias for `procTotal`.
|
||||
|
||||
##### `on`
|
||||
An array of strings defining the columns on which the tables will be joined.
|
||||
_**Both tables must have all columns specified in this list.**_
|
||||
|
||||
```js
|
||||
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### Joined output table
|
||||
```
|
||||
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
|
||||
_field_mem:string _field_proc:string _measurement_mem:string _measurement_proc:string _start:time _stop:time host:string _time:time _value_mem:int _value_proc:int
|
||||
---------------------- ---------------------- ----------------------- ------------------------ ------------------------------ ------------------------------ ------------------------ ------------------------------ -------------------------- --------------------------
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:00.000000000Z 10956333056 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:10.000000000Z 11014008832 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:20.000000000Z 11373428736 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:30.000000000Z 11001421824 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:40.000000000Z 10985852928 469
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:50.000000000Z 10992279552 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:00.000000000Z 11053568000 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:10.000000000Z 11092242432 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:20.000000000Z 11612774400 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:30.000000000Z 11131961344 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:40.000000000Z 11124805632 469
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:50.000000000Z 11332464640 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:00.000000000Z 11176923136 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:10.000000000Z 11181068288 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:20.000000000Z 11182579712 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:30.000000000Z 11238862848 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:40.000000000Z 11275296768 472
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:50.000000000Z 11225411584 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:00.000000000Z 11252690944 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:10.000000000Z 11227029504 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:20.000000000Z 11201646592 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:30.000000000Z 11227897856 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:40.000000000Z 11330428928 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:50.000000000Z 11347976192 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:00.000000000Z 11368271872 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:10.000000000Z 11269623808 470
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:20.000000000Z 11295637504 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:30.000000000Z 11354423296 473
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:40.000000000Z 11379687424 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:50.000000000Z 11248926720 471
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:55:00.000000000Z 11292524544 471
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
Notice the output table includes the following columns:
|
||||
|
||||
- `_field_mem`
|
||||
- `_field_proc`
|
||||
- `_measurement_mem`
|
||||
- `_measurement_proc`
|
||||
- `_value_mem`
|
||||
- `_value_proc`
|
||||
|
||||
These represent the columns with values unique to the two input tables.
|
||||
|
||||
## Calculate and create a new table
|
||||
With the two streams of data joined into a single table, use the
|
||||
[`map()` function](/{{< latest "flux" >}}/stdlib/universe/map)
|
||||
to build a new table by mapping the existing `_time` column to a new `_time`
|
||||
column, and by dividing `_value_mem` by `_value_proc` and mapping the result to a
|
||||
new `_value` column.
|
||||
|
||||
```js
|
||||
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
|
||||
|> map(fn: (r) => ({_time: r._time, _value: r._value_mem / r._value_proc}))
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### Mapped table
|
||||
```
|
||||
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
|
||||
_field_mem:string _field_proc:string _measurement_mem:string _measurement_proc:string _start:time _stop:time host:string _time:time _value:int
|
||||
---------------------- ---------------------- ----------------------- ------------------------ ------------------------------ ------------------------------ ------------------------ ------------------------------ --------------------------
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:00.000000000Z 23311346
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:10.000000000Z 23434061
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:20.000000000Z 24147407
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:30.000000000Z 23407280
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:40.000000000Z 23423993
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:50:50.000000000Z 23338173
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:00.000000000Z 23518229
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:10.000000000Z 23600515
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:20.000000000Z 24708030
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:30.000000000Z 23685024
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:40.000000000Z 23720267
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:51:50.000000000Z 24060434
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:00.000000000Z 23730197
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:10.000000000Z 23789506
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:20.000000000Z 23792722
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:30.000000000Z 23861704
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:40.000000000Z 23888340
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:52:50.000000000Z 23833145
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:00.000000000Z 23941895
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:10.000000000Z 23887296
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:20.000000000Z 23833290
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:30.000000000Z 23838424
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:40.000000000Z 24056112
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:53:50.000000000Z 24093367
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:00.000000000Z 24136458
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:10.000000000Z 23977922
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:20.000000000Z 23982245
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:30.000000000Z 24005123
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:40.000000000Z 24160695
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:54:50.000000000Z 23883071
|
||||
used total mem processes 2018-11-06T05:50:00.000000000Z 2018-11-06T05:55:00.000000000Z Scotts-MacBook-Pro.local 2018-11-06T05:55:00.000000000Z 23975635
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
This table represents the average amount of memory in bytes per running process.
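
Because both `_value_mem` and `_value_proc` are integers, the division above truncates to whole bytes per process. The following sketch (not part of the original output above) keeps the fractional portion by converting both operands to floats with `float()`:

```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
    |> map(fn: (r) => ({_time: r._time, _value: float(v: r._value_mem) / float(v: r._value_proc)}))
```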
|
||||
|
||||
|
||||
## Real world example
|
||||
The following function calculates the batch sizes written to an InfluxDB cluster by joining
|
||||
fields from `httpd` and `write` measurements in order to compare `pointReq` and `writeReq`.
|
||||
The results are grouped by cluster ID so you can make comparisons across clusters.
|
||||
|
||||
```js
|
||||
batchSize = (cluster_id, start=-1m, interval=10s) => {
|
||||
httpd = from(bucket: "telegraf")
|
||||
|> range(start: start)
|
||||
|> filter(fn: (r) => r._measurement == "influxdb_httpd" and r._field == "writeReq" and r.cluster_id == cluster_id)
|
||||
|> aggregateWindow(every: interval, fn: mean)
|
||||
|> derivative(nonNegative: true, unit: 60s)
|
||||
|
||||
write = from(bucket: "telegraf")
|
||||
|> range(start: start)
|
||||
|> filter(fn: (r) => r._measurement == "influxdb_write" and r._field == "pointReq" and r.cluster_id == cluster_id)
|
||||
|> aggregateWindow(every: interval, fn: max)
|
||||
|> derivative(nonNegative: true, unit: 60s)
|
||||
|
||||
return join(tables: {httpd: httpd, write: write}, on: ["_time", "_stop", "_start", "host"])
|
||||
|> map(fn: (r) => ({_time: r._time, _value: r._value_httpd / r._value_write}))
|
||||
|> group(columns: ["cluster_id"])
|
||||
}
|
||||
|
||||
batchSize(cluster_id: "enter cluster id here")
|
||||
```
|
|
@ -0,0 +1,183 @@
|
|||
---
|
||||
title: Manipulate timestamps with Flux
|
||||
list_title: Manipulate timestamps
|
||||
description: >
|
||||
Use Flux to process and manipulate timestamps.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Manipulate timestamps
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/manipulate-timestamps/
|
||||
v2: /influxdb/v2.0/query-data/flux/manipulate-timestamps/
|
||||
---
|
||||
|
||||
Every point stored in InfluxDB has an associated timestamp.
|
||||
Use Flux to process and manipulate timestamps to suit your needs.
|
||||
|
||||
- [Convert timestamp format](#convert-timestamp-format)
|
||||
- [Calculate the duration between two timestamps](#calculate-the-duration-between-two-timestamps)
|
||||
- [Retrieve the current time](#retrieve-the-current-time)
|
||||
- [Normalize irregular timestamps](#normalize-irregular-timestamps)
|
||||
- [Use timestamps and durations together](#use-timestamps-and-durations-together)
|
||||
|
||||
{{% note %}}
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
{{% /note %}}
|
||||
|
||||
|
||||
## Convert timestamp format
|
||||
|
||||
- [Unix nanosecond to RFC3339](#unix-nanosecond-to-rfc3339)
|
||||
- [RFC3339 to Unix nanosecond](#rfc3339-to-unix-nanosecond)
|
||||
|
||||
### Unix nanosecond to RFC3339
|
||||
Use the [`time()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/time/)
|
||||
to convert a [Unix **nanosecond** timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#unix-timestamp)
|
||||
to an [RFC3339 timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#rfc3339-timestamp).
|
||||
|
||||
```js
|
||||
time(v: 1568808000000000000)
|
||||
// Returns 2019-09-18T12:00:00.000000000Z
|
||||
```
|
||||
|
||||
### RFC3339 to Unix nanosecond
|
||||
Use the [`uint()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/uint/)
|
||||
to convert an RFC3339 timestamp to a Unix nanosecond timestamp.
|
||||
|
||||
```js
|
||||
uint(v: 2019-09-18T12:00:00.000000000Z)
|
||||
// Returns 1568808000000000000
|
||||
```
|
||||
|
||||
## Calculate the duration between two timestamps
|
||||
Flux doesn't support mathematical operations using [time type](/{{< latest "flux" >}}/language/types/#time-types) values.
|
||||
To calculate the duration between two timestamps:
|
||||
|
||||
1. Use the `uint()` function to convert each timestamp to a Unix nanosecond timestamp.
|
||||
2. Subtract one Unix nanosecond timestamp from the other.
|
||||
3. Use the `duration()` function to convert the result into a duration.
|
||||
|
||||
```js
|
||||
time1 = uint(v: 2019-09-17T21:12:05Z)
|
||||
time2 = uint(v: 2019-09-18T22:16:35Z)
|
||||
|
||||
duration(v: time2 - time1)
|
||||
// Returns 25h4m30s
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Flux doesn't support duration column types.
|
||||
To store a duration in a column, use the [`string()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/string/)
|
||||
to convert the duration to a string.
|
||||
{{% /note %}}
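
For example, a minimal sketch building on the calculation above that converts the resulting duration to a string:

```js
time1 = uint(v: 2019-09-17T21:12:05Z)
time2 = uint(v: 2019-09-18T22:16:35Z)

string(v: duration(v: time2 - time1))

// Returns "25h4m30s"
```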
|
||||
|
||||
## Retrieve the current time
|
||||
- [Current UTC time](#current-utc-time)
|
||||
- [Current system time](#current-system-time)
|
||||
|
||||
### Current UTC time
|
||||
Use the [`now()` function](/{{< latest "flux" >}}/stdlib/built-in/misc/now/) to
|
||||
return the current UTC time in RFC3339 format.
|
||||
|
||||
```js
|
||||
now()
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
`now()` is cached at runtime, so all instances of `now()` in a Flux script
|
||||
return the same value.
|
||||
{{% /note %}}
|
||||
|
||||
### Current system time
|
||||
Import the `system` package and use the [`system.time()` function](/{{< latest "flux" >}}/stdlib/system/time/)
|
||||
to return the current system time of the host machine in RFC3339 format.
|
||||
|
||||
```js
|
||||
import "system"
|
||||
|
||||
system.time()
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
`system.time()` returns the time it is executed, so each instance of `system.time()`
|
||||
in a Flux script returns a unique value.
|
||||
{{% /note %}}
|
||||
|
||||
## Normalize irregular timestamps
|
||||
To normalize irregular timestamps, truncate all `_time` values to a specified unit
|
||||
with the [`truncateTimeColumn()` function](/{{< latest "flux" >}}/stdlib/universe/truncatetimecolumn/).
|
||||
This is useful in [`join()`](/{{< latest "flux" >}}/stdlib/universe/join/)
|
||||
and [`pivot()`](/{{< latest "flux" >}}/stdlib/universe/pivot/)
|
||||
operations where points should align by time, but timestamps vary slightly.
|
||||
|
||||
```js
|
||||
data
|
||||
|> truncateTimeColumn(unit: 1m)
|
||||
```
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:49Z | 2.0 |
|
||||
| 2020-01-01T00:01:01Z | 1.9 |
|
||||
| 2020-01-01T00:03:22Z | 1.8 |
|
||||
| 2020-01-01T00:04:04Z | 1.9 |
|
||||
| 2020-01-01T00:05:38Z | 2.1 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**Output:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 2.0 |
|
||||
| 2020-01-01T00:01:00Z | 1.9 |
|
||||
| 2020-01-01T00:03:00Z | 1.8 |
|
||||
| 2020-01-01T00:04:00Z | 1.9 |
|
||||
| 2020-01-01T00:05:00Z | 2.1 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
## Use timestamps and durations together
|
||||
- [Add a duration to a timestamp](#add-a-duration-to-a-timestamp)
|
||||
- [Subtract a duration from a timestamp](#subtract-a-duration-from-a-timestamp)
|
||||
|
||||
### Add a duration to a timestamp
|
||||
The [`experimental.addDuration()` function](/{{< latest "flux" >}}/stdlib/experimental/addduration/)
|
||||
adds a duration to a specified time and returns the resulting time.
|
||||
|
||||
{{% warn %}}
|
||||
By using `experimental.addDuration()`, you accept the
|
||||
[risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
|
||||
{{% /warn %}}
|
||||
|
||||
```js
|
||||
import "experimental"
|
||||
|
||||
experimental.addDuration(d: 6h, to: 2019-09-16T12:00:00Z)
|
||||
|
||||
// Returns 2019-09-16T18:00:00.000000000Z
|
||||
```
|
||||
|
||||
### Subtract a duration from a timestamp
|
||||
The [`experimental.subDuration()` function](/{{< latest "flux" >}}/stdlib/experimental/subduration/)
|
||||
subtracts a duration from a specified time and returns the resulting time.
|
||||
|
||||
{{% warn %}}
|
||||
By using `experimental.subDuration()`, you accept the
|
||||
[risks of experimental functions](/{{< latest "flux" >}}/stdlib/experimental/#experimental-functions-are-subject-to-change).
|
||||
{{% /warn %}}
|
||||
|
||||
```js
|
||||
import "experimental"
|
||||
|
||||
experimental.subDuration(d: 6h, from: 2019-09-16T12:00:00Z)
|
||||
|
||||
// Returns 2019-09-16T06:00:00.000000000Z
|
||||
```
|
|
@ -0,0 +1,200 @@
|
|||
---
|
||||
title: Transform data with mathematic operations
|
||||
seotitle: Transform data with mathematic operations in Flux
|
||||
list_title: Transform data with math
|
||||
description: >
|
||||
Use the `map()` function to remap column values and apply mathematic operations.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Transform data with math
|
||||
parent: Query with Flux
|
||||
weight: 5
|
||||
list_query_example: map_math
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/mathematic-operations/
|
||||
v2: /influxdb/v2.0/query-data/flux/mathematic-operations/
|
||||
---
|
||||
|
||||
Flux supports mathematic expressions in data transformations.
|
||||
This article describes how to use [Flux arithmetic operators](/{{< latest "flux" >}}/language/operators/#arithmetic-operators)
|
||||
to "map" over data and transform values using mathematic operations.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
##### Basic mathematic operations
|
||||
```js
|
||||
// Examples executed using the Flux REPL
|
||||
> 9 + 9
|
||||
18
|
||||
> 22 - 14
|
||||
8
|
||||
> 6 * 5
|
||||
30
|
||||
> 21 / 7
|
||||
3
|
||||
```
|
||||
|
||||
<p style="font-size:.85rem;font-style:italic;margin-top:-2rem;">See <a href="/influxdb/v2.0/tools/repl/">Flux read-eval-print-loop (REPL)</a>.</p>
|
||||
|
||||
{{% note %}}
|
||||
#### Operands must be the same type
|
||||
Operands in Flux mathematic operations must be the same data type.
|
||||
For example, integers cannot be used in operations with floats.
|
||||
Otherwise, you will get an error similar to:
|
||||
|
||||
```
|
||||
Error: type error: float != int
|
||||
```
|
||||
|
||||
To convert operands to the same type, use [type-conversion functions](/{{< latest "flux" >}}/stdlib/universe/type-conversions/)
|
||||
or manually format operands.
|
||||
The operand data type determines the output data type.
|
||||
For example:
|
||||
|
||||
```js
|
||||
100 // Parsed as an integer
|
||||
100.0 // Parsed as a float
|
||||
|
||||
// Example evaluations
|
||||
> 20 / 8
|
||||
2
|
||||
|
||||
> 20.0 / 8.0
|
||||
2.5
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
## Custom mathematic functions
|
||||
Flux lets you [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) that use mathematic operations.
|
||||
View the examples below.
|
||||
|
||||
###### Custom multiplication function
|
||||
```js
|
||||
multiply = (x, y) => x * y
|
||||
|
||||
multiply(x: 10, y: 12)
|
||||
// Returns 120
|
||||
```
|
||||
|
||||
###### Custom percentage function
|
||||
```js
|
||||
percent = (sample, total) => (sample / total) * 100.0
|
||||
|
||||
percent(sample: 20.0, total: 80.0)
|
||||
// Returns 25.0
|
||||
```
|
||||
|
||||
### Transform values in a data stream
|
||||
To transform multiple values in an input stream, your function needs to:
|
||||
|
||||
- [Handle piped-forward data](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions/#functions-that-manipulate-piped-forward-data).
|
||||
- Ensure that each operand necessary for the calculation exists in each row _(see [Pivot vs join](#pivot-vs-join) below)_.
|
||||
- Use the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map) to iterate over each row.
|
||||
|
||||
The example `multiplyByX()` function below includes:
|
||||
|
||||
- A `tables` parameter that represents the input data stream (`<-`).
|
||||
- An `x` parameter which is the number by which values in the `_value` column are multiplied.
|
||||
- A `map()` function that iterates over each row in the input stream.
|
||||
It uses the `with` operator to preserve existing columns in each row.
|
||||
It also multiplies the `_value` column by `x`.
|
||||
|
||||
```js
|
||||
multiplyByX = (x, tables=<-) => tables
|
||||
|> map(fn: (r) => ({r with _value: r._value * x}))
|
||||
|
||||
data
|
||||
|> multiplyByX(x: 10)
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Convert bytes to gigabytes
|
||||
To convert active memory from bytes to gigabytes (GB), divide the `active` field
|
||||
in the `mem` measurement by 1,073,741,824.
|
||||
|
||||
The `map()` function iterates over each row in the piped-forward data and defines
|
||||
a new `_value` by dividing the original `_value` by 1073741824.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -10m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "active")
|
||||
|> map(fn: (r) => ({r with _value: r._value / 1073741824}))
|
||||
```
|
||||
|
||||
You could turn that same calculation into a function:
|
||||
|
||||
```js
|
||||
bytesToGB = (tables=<-) => tables
|
||||
|> map(fn: (r) => ({r with _value: r._value / 1073741824}))
|
||||
|
||||
data
|
||||
|> bytesToGB()
|
||||
```
|
||||
|
||||
#### Include partial gigabytes
|
||||
Because the original metric (bytes) is an integer, the output of the operation is an integer and does not include partial GBs.
|
||||
To calculate partial GBs, convert the `_value` column and its values to floats using the
|
||||
[`float()` function](/{{< latest "flux" >}}/stdlib/universe/type-conversions/float)
|
||||
and format the denominator in the division operation as a float.
|
||||
|
||||
```js
|
||||
bytesToGB = (tables=<-) => tables
|
||||
|> map(fn: (r) => ({r with _value: float(v: r._value) / 1073741824.0}))
|
||||
```
|
||||
|
||||
### Calculate a percentage
|
||||
To calculate a percentage, use simple division, then multiply the result by 100.
|
||||
|
||||
```js
|
||||
> 1.0 / 4.0 * 100.0
|
||||
25.0
|
||||
```
|
||||
|
||||
_For an in-depth look at calculating percentages, see [Calculate percentages](/enterprise_influxdb/v1.10/flux/guides/calculate-percentages)._
|
||||
|
||||
## Pivot vs join
|
||||
To query and use values in mathematical operations in Flux, operand values must
|
||||
exist in a single row.
|
||||
Both `pivot()` and `join()` will do this, but there are important differences between the two:
|
||||
|
||||
#### Pivot is more performant
|
||||
`pivot()` reads and operates on a single stream of data.
|
||||
`join()` requires two streams of data and the overhead of reading and combining
|
||||
both streams can be significant, especially for larger data sets.
|
||||
|
||||
#### Use join for multiple data sources
|
||||
Use `join()` when querying data from different buckets or data sources.
|
||||
|
||||
##### Pivot fields into columns for mathematic calculations
|
||||
```js
|
||||
data
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({r with _value: (r.field1 + r.field2) / r.field3 * 100.0}))
|
||||
```
|
||||
|
||||
##### Join multiple data sources for mathematic calculations
|
||||
```js
|
||||
import "sql"
|
||||
import "influxdata/influxdb/secrets"
|
||||
|
||||
pgUser = secrets.get(key: "POSTGRES_USER")
|
||||
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
|
||||
pgHost = secrets.get(key: "POSTGRES_HOST")
|
||||
|
||||
t1 = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
|
||||
query: "SELECT id, name, available FROM exampleTable",
|
||||
)
|
||||
|
||||
t2 = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement" and r._field == "example-field")
|
||||
|
||||
join(tables: {t1: t1, t2: t2}, on: ["id"])
|
||||
|> map(fn: (r) => ({r with _value: r._value_t2 / r.available_t1 * 100.0}))
|
||||
```
|
|
@ -0,0 +1,143 @@
|
|||
---
|
||||
title: Find median values
|
||||
seotitle: Find median values in Flux
|
||||
list_title: Median
|
||||
description: >
|
||||
Use the `median()` function to return a value representing the `0.5` quantile (50th percentile) or median of input data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Median
|
||||
list_query_example: median
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/median/
|
||||
v2: /influxdb/v2.0/query-data/flux/median/
|
||||
---
|
||||
|
||||
Use the [`median()` function](/{{< latest "flux" >}}/stdlib/universe/median/)
|
||||
to return a value representing the `0.5` quantile (50th percentile) or median of input data.
|
||||
|
||||
## Select a method for calculating the median
|
||||
Select one of the following methods to calculate the median:
|
||||
|
||||
- [estimate_tdigest](#estimate-tdigest)
|
||||
- [exact_mean](#exact-mean)
|
||||
- [exact_selector](#exact-selector)
|
||||
|
||||
### estimate_tdigest
|
||||
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
|
||||
to compute an accurate `0.5` quantile estimate on large data sources.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`estimate_tdigest` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_mean
|
||||
An aggregate method that takes the average of the two points closest to the `0.5` quantile value.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_mean` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_selector
|
||||
A selector method that returns the data point for which at least 50% of points are less than.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_selector` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
## Find the value that represents the median
|
||||
Use the default method, `"estimate_tdigest"`, to return a single row per input table
containing an estimate of the median (50th percentile) of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median()
|
||||
```
|
||||
|
||||
## Find the average of values closest to the median
|
||||
Use the `exact_mean` method to return a single row per input table containing the
|
||||
average of the two values closest to the mathematical median of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median(method: "exact_mean")
|
||||
```
|
||||
|
||||
## Find the point with the median value
|
||||
Use the `exact_selector` method to return a single row per input table containing the
|
||||
value that 50% of values in the table are less than.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median(method: "exact_selector")
|
||||
```
|
||||
|
||||
## Use median() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, and then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
To specify the [median calculation method](#select-a-method-for-calculating-the-median) in `aggregateWindow()`, use the
|
||||
[full function syntax](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/#specify-parameters-of-the-aggregate-function):
|
||||
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(every: 5m, fn: (tables=<-, column) => tables |> median(method: "exact_selector"))
|
||||
```
|
|
@ -0,0 +1,143 @@
|
|||
---
|
||||
title: Monitor states
|
||||
seotitle: Monitor states and state changes in your events and metrics with Flux.
|
||||
description: Flux provides several functions to help monitor states and state changes in your data.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Monitor states
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/monitor-states/
|
||||
v2: /influxdb/v2.0/query-data/flux/monitor-states/
|
||||
---
|
||||
|
||||
Flux helps you monitor states in your metrics and events:
|
||||
|
||||
- [Find how long a state persists](#find-how-long-a-state-persists)
|
||||
- [Count the number of consecutive states](#count-the-number-of-consecutive-states)
|
||||
- [Detect state changes](#example-query-to-count-machine-state)
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/executing-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Find how long a state persists
|
||||
|
||||
1. Use the [`stateDuration()`](/{{< latest "flux" >}}/stdlib/universe/stateduration/) function to calculate how long a column value has remained the same value (or state). Include the following information:
|
||||
|
||||
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
|
||||
- **Value:** the value (or state) to search for in the specified column.
|
||||
- **State duration column:** a new column to store the state duration (the length of time that the specified value persists).
|
||||
- **Unit:** the unit of time (`1s` (by default), `1m`, `1h`) used to increment the state duration.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> stateDuration(
|
||||
fn: (r) => r._column_to_search == "value_to_search_for",
|
||||
column: "state_duration",
|
||||
unit: 1s,
|
||||
)
|
||||
```
|
||||
|
||||
2. Use `stateDuration()` to search each point for the specified value:
|
||||
|
||||
- For the first point that evaluates `true`, the state duration is set to `0`. For each consecutive point that evaluates `true`, the state duration increases by the time interval between each consecutive point (in specified units).
|
||||
- If the state is `false`, the state duration is reset to `-1`.
|
||||
|
||||
### Example query with stateDuration()
|
||||
|
||||
The following query searches the `doors` bucket over the past 5 minutes to find how many seconds a door has been `closed`.
|
||||
|
||||
```js
|
||||
from(bucket: "doors")
|
||||
|> range(start: -5m)
|
||||
|> stateDuration(
|
||||
fn: (r) => r._value == "closed",
|
||||
column: "door_closed",
|
||||
unit: 1s,
|
||||
)
|
||||
```
|
||||
|
||||
In this example, `door_closed` is the **State duration** column. If you write data to the `doors` bucket every minute, the state duration increases by `60s` for each consecutive point where `_value` is `closed`. If `_value` is not `closed`, the state duration is reset to `-1`.
|
||||
|
||||
#### Query results
|
||||
|
||||
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
|
||||
|
||||
```bash
|
||||
_time _value door_closed
|
||||
2019-10-26T17:39:16Z closed 0
|
||||
2019-10-26T17:40:16Z closed 60
|
||||
2019-10-26T17:41:16Z closed 120
|
||||
2019-10-26T17:42:16Z open -1
|
||||
2019-10-26T17:43:16Z closed 0
|
||||
2019-10-26T17:44:27Z closed 60
|
||||
```
|
||||
|
||||
## Count the number of consecutive states
|
||||
|
||||
1. Use the `stateCount()` function and include the following information:
|
||||
|
||||
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
|
||||
- **Value:** to search for in the specified column.
|
||||
- **State count column:** a new column to store the state count (the number of consecutive records in which the specified value exists).
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> stateCount(
|
||||
fn: (r) => r._column_to_search == "value_to_search_for",
|
||||
column: "state_count"
|
||||
)
|
||||
```
|
||||
|
||||
2. Use `stateCount()` to search each point for the specified value:
|
||||
|
||||
- For the first point that evaluates `true`, the state count is set to `1`. For each consecutive point that evaluates `true`, the state count increases by 1.
|
||||
- If the state is `false`, the state count is reset to `-1`.
|
||||
|
||||
### Example query with stateCount()
|
||||
|
||||
The following query searches the `doors` bucket over the past 5 minutes and
|
||||
calculates how many points have `closed` as their `_value`.
|
||||
|
||||
```js
|
||||
from(bucket: "doors")
|
||||
|> range(start: -5m)
|
||||
|> stateCount(fn: (r) => r._value == "closed", column: "door_closed")
|
||||
```
|
||||
|
||||
This example stores the **state count** in the `door_closed` column.
|
||||
If you write data to the `doors` bucket every minute, the state count increases
|
||||
by `1` for each consecutive point where `_value` is `closed`.
|
||||
If `_value` is not `closed`, the state count is reset to `-1`.
|
||||
|
||||
#### Query results
|
||||
|
||||
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
|
||||
|
||||
```bash
|
||||
_time _value door_closed
|
||||
2019-10-26T17:39:16Z closed 1
|
||||
2019-10-26T17:40:16Z closed 2
|
||||
2019-10-26T17:41:16Z closed 3
|
||||
2019-10-26T17:42:16Z open -1
|
||||
2019-10-26T17:43:16Z closed 1
|
||||
2019-10-26T17:44:27Z closed 2
|
||||
```
|
||||
|
||||
#### Example query to count machine state
|
||||
|
||||
The following query checks the machine state every minute (idle, assigned, or busy).
|
||||
InfluxDB searches the `servers` bucket over the past hour and counts records with a machine state of `idle`, `assigned` or `busy`.
|
||||
|
||||
```js
|
||||
from(bucket: "servers")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r.machine_state == "idle" or r.machine_state == "assigned" or r.machine_state == "busy")
|
||||
|> stateCount(fn: (r) => r.machine_state == "busy", column: "_count")
|
||||
|> stateCount(fn: (r) => r.machine_state == "assigned", column: "_count")
|
||||
|> stateCount(fn: (r) => r.machine_state == "idle", column: "_count")
|
||||
```
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
title: Calculate the moving average
|
||||
seotitle: Calculate the moving average in Flux
|
||||
list_title: Moving Average
|
||||
description: >
|
||||
Use the `movingAverage()` or `timedMovingAverage()` functions to return the moving average of data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Moving Average
|
||||
list_query_example: moving_average
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/moving-average/
|
||||
v2: /influxdb/v2.0/query-data/flux/moving-average/
|
||||
---
|
||||
|
||||
Use the [`movingAverage()`](/{{< latest "flux" >}}/stdlib/universe/movingaverage/)
|
||||
or [`timedMovingAverage()`](/{{< latest "flux" >}}/stdlib/universe/timedmovingaverage/)
|
||||
functions to return the moving average of data.
|
||||
|
||||
```js
|
||||
data
|
||||
|> movingAverage(n: 5)
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> timedMovingAverage(every: 5m, period: 10m)
|
||||
```
|
||||
|
||||
### movingAverage()
|
||||
For each row in a table, `movingAverage()` returns the average of the current value and
|
||||
**previous** values where `n` is the total number of values used to calculate the average.
|
||||
|
||||
If `n = 3`:
|
||||
|
||||
| Row # | Calculation |
|
||||
|:-----:|:----------- |
|
||||
| 1 | _Insufficient number of rows_ |
|
||||
| 2 | _Insufficient number of rows_ |
|
||||
| 3 | (Row1 + Row2 + Row3) / 3 |
|
||||
| 4 | (Row2 + Row3 + Row4) / 3 |
|
||||
| 5 | (Row3 + Row4 + Row5) / 3 |
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.2 |
|
||||
| 2020-01-01T00:03:00Z | 1.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.9 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
| 2020-01-01T00:06:00Z | 2.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following would return:**
|
||||
|
||||
```js
|
||||
|> movingAverage(n: 3)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:03:00Z | 1.33 |
|
||||
| 2020-01-01T00:04:00Z | 1.30 |
|
||||
| 2020-01-01T00:05:00Z | 1.36 |
|
||||
| 2020-01-01T00:06:00Z | 1.43 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### timedMovingAverage()
|
||||
For each row in a table, `timedMovingAverage()` returns the average of the
|
||||
current value and all row values in the **previous** `period` (duration).
|
||||
It returns moving averages at a frequency defined by the `every` parameter.
|
||||
|
||||
Each color in the diagram below represents a period of time used to calculate an
|
||||
average and the time a point representing the average is returned.
|
||||
If `every = 30m` and `period = 1h`:
|
||||
|
||||
{{< svg "/static/svgs/timed-moving-avg.svg" >}}
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.2 |
|
||||
| 2020-01-01T00:03:00Z | 1.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.9 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
| 2020-01-01T00:06:00Z | 2.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following would return:**
|
||||
|
||||
```js
|
||||
|> timedMovingAverage(every: 2m, period: 4m)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.000 |
|
||||
| 2020-01-01T00:04:00Z | 1.333 |
|
||||
| 2020-01-01T00:06:00Z | 1.325 |
|
||||
| 2020-01-01T00:06:00Z | 1.150 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,162 @@
|
|||
---
|
||||
title: Find percentile and quantile values
|
||||
seotitle: Query percentile and quantile values in Flux
|
||||
list_title: Percentile & quantile
|
||||
description: >
|
||||
Use the `quantile()` function to return all values within the `q` quantile or
|
||||
percentile of input data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Percentile & quantile
|
||||
list_query_example: quantile
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/percentile-quantile/
|
||||
v2: /influxdb/v2.0/query-data/flux/percentile-quantile/
|
||||
---
|
||||
|
||||
Use the [`quantile()` function](/{{< latest "flux" >}}/stdlib/universe/quantile/)
|
||||
to return a value representing the `q` quantile or percentile of input data.
|
||||
|
||||
## Percentile versus quantile
|
||||
Percentiles and quantiles are very similar, differing only in the number used to calculate return values.
|
||||
A percentile is calculated using numbers between `0` and `100`.
|
||||
A quantile is calculated using numbers between `0.0` and `1.0`.
|
||||
For example, the **`0.5` quantile** is the same as the **50th percentile**.
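For instance, to request the 95th percentile, divide the percentile by 100 and pass the result as `q`. A minimal sketch (`data` stands for any stream of tables defined earlier in the query, as in the examples further down):

```js
data
    |> quantile(q: 0.95)
```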
|
||||
|
||||
## Select a method for calculating the quantile
|
||||
Select one of the following methods to calculate the quantile:
|
||||
|
||||
- [estimate_tdigest](#estimate-tdigest)
|
||||
- [exact_mean](#exact-mean)
|
||||
- [exact_selector](#exact-selector)
|
||||
|
||||
### estimate_tdigest
|
||||
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
|
||||
to compute a quantile estimate on large data sources.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`estimate_tdigest` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_mean
|
||||
An aggregate method that takes the average of the two points closest to the quantile value.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_mean` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_selector
|
||||
A selector method that returns the data point for which at least `q` points are less than.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_selector` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.10/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
## Find the value representing the 99th percentile
|
||||
Use the default method, `"estimate_tdigest"`, to return a single row per input table
containing an estimate of the 99th percentile of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99)
|
||||
```
|
||||
|
||||
## Find the average of values closest to the quantile
|
||||
Use the `exact_mean` method to return a single row per input table containing the
|
||||
average of the two values closest to the mathematical quantile of data in the table.
|
||||
For example, to calculate the `0.99` quantile:
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99, method: "exact_mean")
|
||||
```
|
||||
|
||||
## Find the point with the quantile value
|
||||
Use the `exact_selector` method to return a single row per input table containing the
|
||||
value that `q * 100`% of values in the table are less than.
|
||||
For example, to calculate the `0.99` quantile:
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99, method: "exact_selector")
|
||||
```
|
||||
|
||||
## Use quantile() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, and then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
To specify the [quantile calculation method](#select-a-method-for-calculating-the-quantile) in
|
||||
`aggregateWindow()`, use the [full function syntax](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow/#specify-parameters-of-the-aggregate-function):
|
||||
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
fn: (tables=<-, column) => tables
|
||||
|> quantile(q: 0.99, method: "exact_selector"),
|
||||
)
|
||||
```
|
|
@ -0,0 +1,75 @@
|
|||
---
|
||||
title: Query fields and tags
|
||||
seotitle: Query fields and tags in InfluxDB using Flux
|
||||
description: >
|
||||
Use the `filter()` function to query data based on fields, tags, or any other column value.
|
||||
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
|
||||
clause in InfluxQL and other SQL-like query languages.
|
||||
weight: 1
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/query-fields/
|
||||
v2: /influxdb/v2.0/query-data/flux/query-fields/
|
||||
list_code_example: |
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field" and
|
||||
r.tag == "example-tag"
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [`filter()` function](/{{< latest "flux" >}}/stdlib/universe/filter/)
|
||||
to query data based on fields, tags, or any other column value.
|
||||
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
|
||||
clause in InfluxQL and other SQL-like query languages.
|
||||
|
||||
## The filter() function
|
||||
`filter()` has an `fn` parameter that expects a **predicate function**,
|
||||
an anonymous function comprised of one or more **predicate expressions**.
|
||||
The predicate function evaluates each input row.
|
||||
Rows that evaluate to `true` are **included** in the output data.
|
||||
Rows that evaluate to `false` are **excluded** from the output data.
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement" )
|
||||
```
|
||||
|
||||
The `fn` predicate function requires an `r` argument, which represents each row
|
||||
as `filter()` iterates over input data.
|
||||
Key-value pairs in the row record represent columns and their values.
|
||||
Use **dot notation** or **bracket notation** to reference specific column values in the predicate function.
|
||||
Use [logical operators](/{{< latest "flux" >}}/language/operators/#logical-operators)
|
||||
to chain multiple predicate expressions together.
|
||||
|
||||
```js
|
||||
// Row record
|
||||
r = {foo: "bar", baz: "quz"}
|
||||
|
||||
// Example predicate function
|
||||
(r) => r.foo == "bar" and r["baz"] == "quz"
|
||||
|
||||
// Evaluation results
|
||||
(r) => true and true
|
||||
```
|
||||
|
||||
## Filter by fields and tags
|
||||
The combination of [`from()`](/{{< latest "flux" >}}/stdlib/built-in/inputs/from),
|
||||
[`range()`](/{{< latest "flux" >}}/stdlib/universe/range),
|
||||
and `filter()` represent the most basic Flux query:
|
||||
|
||||
1. Use `from()` to define your [bucket](/enterprise_influxdb/v1.10/flux/get-started/#buckets).
|
||||
2. Use `range()` to limit query results by time.
|
||||
3. Use `filter()` to identify what rows of data to output.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag")
|
||||
|> filter(fn: (r) => r._field == "example-field")
|
||||
```
|
|
@ -0,0 +1,165 @@
|
|||
---
|
||||
title: Calculate the rate of change
|
||||
seotitle: Calculate the rate of change in Flux
|
||||
list_title: Rate
|
||||
description: >
|
||||
Use the `derivative()` function to calculate the rate of change between subsequent values or the
|
||||
`aggregate.rate()` function to calculate the average rate of change per window of time.
|
||||
If time between points varies, these functions normalize points to a common time interval
|
||||
making values easily comparable.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
name: Rate
|
||||
list_query_example: rate_of_change
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/rate/
|
||||
v2: /influxdb/v2.0/query-data/flux/rate/
|
||||
---
|
||||
|
||||
|
||||
Use the [`derivative()` function](/{{< latest "flux" >}}/stdlib/universe/derivative/)
|
||||
to calculate the rate of change between subsequent values or the
|
||||
[`aggregate.rate()` function](/{{< latest "flux" >}}/stdlib/experimental/aggregate/rate/)
|
||||
to calculate the average rate of change per window of time.
|
||||
If time between points varies, these functions normalize points to a common time interval
|
||||
making values easily comparable.
|
||||
|
||||
- [Rate of change between subsequent values](#rate-of-change-between-subsequent-values)
|
||||
- [Average rate of change per window of time](#average-rate-of-change-per-window-of-time)
|
||||
|
||||
## Rate of change between subsequent values
|
||||
Use the [`derivative()` function](/{{< latest "flux" >}}/stdlib/universe/derivative/)
|
||||
to calculate the rate of change per unit of time between subsequent _non-null_ values.
|
||||
|
||||
```js
|
||||
data
|
||||
|> derivative(unit: 1s)
|
||||
```
|
||||
|
||||
By default, `derivative()` returns only positive derivative values and replaces negative values with _null_.
|
||||
Calculated values are returned as [floats](/{{< latest "flux" >}}/language/types/#numeric-types).
|
||||
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`derivative(unit: 1m)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | |
|
||||
| 2020-01-01T00:12:00Z | |
|
||||
| 2020-01-01T00:19:00Z | 10.0 |
|
||||
| 2020-01-01T00:32:00Z | |
|
||||
| 2020-01-01T00:51:00Z | 4.74 |
|
||||
| 2020-01-01T01:00:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the rate of change **per minute** between subsequent values with
|
||||
negative values set to _null_. For example, the change from `150` to `220` over the seven minutes between those points yields `10.0` per minute.
|
||||
|
||||
### Return negative derivative values
|
||||
To return negative derivative values, set the `nonNegative` parameter to `false`.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following returns:**
|
||||
|
||||
```js
|
||||
|> derivative(unit: 1m, nonNegative: false)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | -22.5 |
|
||||
| 2020-01-01T00:12:00Z | -1.25 |
|
||||
| 2020-01-01T00:19:00Z | 10.0 |
|
||||
| 2020-01-01T00:32:00Z | -1.54 |
|
||||
| 2020-01-01T00:51:00Z | 4.74 |
|
||||
| 2020-01-01T01:00:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the rate of change **per minute** between subsequent values and
|
||||
include negative values.
|
||||
|
||||
## Average rate of change per window of time
|
||||
|
||||
Use the [`aggregate.rate()` function](/{{< latest "flux" >}}/stdlib/experimental/aggregate/rate/)
|
||||
to calculate the average rate of change per window of time.
|
||||
|
||||
```js
|
||||
import "experimental/aggregate"
|
||||
|
||||
data
|
||||
|> aggregate.rate(every: 1m, unit: 1s, groupColumns: ["tag1", "tag2"])
|
||||
```
|
||||
|
||||
`aggregate.rate()` returns the average rate of change (as a [float](/{{< latest "flux" >}}/language/types/#numeric-types))
|
||||
per `unit` for time intervals defined by `every`.
|
||||
Negative values are replaced with _null_.
|
||||
|
||||
{{% note %}}
|
||||
`aggregate.rate()` does not support `nonNegative: false`.
|
||||
{{% /note %}}
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following returns:**
|
||||
|
||||
```js
|
||||
|> aggregate.rate(every: 20m, unit: 1m)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:20:00Z | |
|
||||
| 2020-01-01T00:40:00Z | 10.0 |
|
||||
| 2020-01-01T01:00:00Z | 4.74 |
|
||||
| 2020-01-01T01:20:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the **average change rate per minute** of every **20 minute interval**
|
||||
with negative values set to _null_.
|
||||
Timestamps represent the right bound of the time window used to average values.
|
|
@ -0,0 +1,86 @@
|
|||
---
|
||||
title: Use regular expressions in Flux
|
||||
list_title: Regular expressions
|
||||
description: This guide walks through using regular expressions in evaluation logic in Flux functions.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Regular expressions
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
list_query_example: regular_expressions
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/regular-expressions/
|
||||
v2: /influxdb/v2.0/query-data/flux/regular-expressions/
|
||||
---
|
||||
|
||||
Regular expressions (regexes) are incredibly powerful when matching patterns in large collections of data.
|
||||
With Flux, regular expressions are primarily used for evaluation logic in predicate functions for things
|
||||
such as filtering rows, dropping and keeping columns, state detection, etc.
|
||||
This guide shows how to use regular expressions in your Flux scripts.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Go regular expression syntax
|
||||
Flux uses Go's [regexp package](https://golang.org/pkg/regexp/) for regular expression search.
|
||||
The links [below](#helpful-links) provide information about Go's regular expression syntax.
|
||||
|
||||
## Regular expression operators
|
||||
Flux provides two comparison operators for use with regular expressions.
|
||||
|
||||
#### `=~`
|
||||
When the expression on the left **MATCHES** the regular expression on the right, this evaluates to `true`.
|
||||
|
||||
#### `!~`
|
||||
When the expression on the left **DOES NOT MATCH** the regular expression on the right, this evaluates to `true`.
|
||||
|
||||
## Regular expressions in Flux
|
||||
When using regex matching in your Flux scripts, enclose your regular expressions with `/`.
|
||||
The following is the basic regex comparison syntax:
|
||||
|
||||
###### Basic regex comparison syntax
|
||||
```js
|
||||
expression =~ /regex/
|
||||
expression !~ /regex/
|
||||
```
|
||||
## Examples
|
||||
|
||||
### Use a regex to filter by tag value
|
||||
The following example filters records by the `cpu` tag.
|
||||
It only keeps records for which the `cpu` is either `cpu0`, `cpu1`, or `cpu2`.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu =~ /cpu[0-2]/)
|
||||
```
|
||||
|
||||
### Use a regex to filter by field key
|
||||
The following example excludes records that do not have `_percent` in a field key.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field =~ /_percent/)
|
||||
```
|
||||
|
||||
### Drop columns matching a regex
|
||||
The following example drops columns whose names do not begin with `_`.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "mem")
|
||||
|> drop(fn: (column) => column !~ /_.*/)
|
||||
```
|
||||
|
||||
## Helpful links
|
||||
|
||||
##### Syntax documentation
|
||||
[regexp Syntax GoDoc](https://godoc.org/regexp/syntax)
|
||||
[RE2 Syntax Overview](https://github.com/google/re2/wiki/Syntax)
|
||||
|
||||
##### Go regex testers
|
||||
[Regex Tester - Golang](https://regex-golang.appspot.com/assets/html/index.html)
|
||||
[Regex101](https://regex101.com/)
|
|
@ -0,0 +1,249 @@
|
|||
---
|
||||
title: Extract scalar values in Flux
|
||||
list_title: Extract scalar values
|
||||
description: >
|
||||
Use Flux stream and table functions to extract scalar values from Flux query output.
|
||||
This lets you, for example, dynamically set variables using query results.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Extract scalar values
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/scalar-values/
|
||||
v2: /influxdb/v2.0/query-data/flux/scalar-values/
|
||||
list_code_example: |
|
||||
```js
|
||||
scalarValue = {
|
||||
_record =
|
||||
data
|
||||
|> tableFind(fn: (key) => true)
|
||||
|> getRecord(idx: 0)
|
||||
return _record._value
|
||||
}
|
||||
```
|
||||
---
|
||||
|
||||
Use Flux [stream and table functions](/{{< latest "flux" >}}/stdlib/universe/stream-table/)
|
||||
to extract scalar values from Flux query output.
|
||||
This lets you, for example, dynamically set variables using query results.
|
||||
|
||||
**To extract scalar values from output:**
|
||||
|
||||
1. [Extract a table](#extract-a-table).
|
||||
2. [Extract a column from the table](#extract-a-column-from-the-table)
|
||||
_**or**_ [extract a row from the table](#extract-a-row-from-the-table).
|
||||
|
||||
_The samples on this page use the [sample data provided below](#sample-data)._
|
||||
|
||||
{{% warn %}}
|
||||
#### Current limitations
|
||||
- The InfluxDB user interface (UI) does not currently support raw scalar output.
|
||||
Use [`map()`](/{{< latest "flux" >}}/stdlib/universe/map/) to add
|
||||
scalar values to output data, as shown in the sketch below.
|
||||
- The [Flux REPL](/enterprise_influxdb/v1.10/flux/guides/execute-queries/#influx-cli) does not currently support
|
||||
Flux stream and table functions (also known as "dynamic queries").
|
||||
See [#15321](https://github.com/influxdata/influxdb/issues/15231).
|
||||
{{% /warn %}}
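For example, a minimal sketch of the `map()` workaround mentioned above. The names `data` and `meanValue` are placeholders: `meanValue` is assumed to be a scalar already extracted using the techniques described on this page.

```js
data
    |> map(fn: (r) => ({r with mean: meanValue}))  // add the extracted scalar as a new "mean" column
```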
|
||||
|
||||
## Extract a table
|
||||
Flux formats query results as a stream of tables.
|
||||
To extract a scalar value from a stream of tables, you must first extract a single table.
|
||||
|
||||
Use the `tableFind()` function to extract a single table from the stream of tables.
|
||||
|
||||
{{% note %}}
|
||||
If query results include only one table, it is still formatted as a stream of tables.
|
||||
You still must extract that table from the stream.
|
||||
{{% /note %}}
|
||||
|
||||
Use [`tableFind()`](/{{< latest "flux" >}}/stdlib/universe/stream-table/tablefind/)
|
||||
to extract the **first** table whose [group key](/enterprise_influxdb/v1.10/flux/get-started/#group-keys)
|
||||
values match the `fn` **predicate function**.
|
||||
The predicate function requires a `key` record, which represents the group key of
|
||||
each table.
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|
||||
```
|
||||
|
||||
The example above returns a single table:
|
||||
|
||||
| _time | location | _field | _value |
|
||||
|:----- |:--------:|:------:| ------:|
|
||||
| 2019-11-01T12:00:00Z | sfo | temp | 65.1 |
|
||||
| 2019-11-01T13:00:00Z | sfo | temp | 66.2 |
|
||||
| 2019-11-01T14:00:00Z | sfo | temp | 66.3 |
|
||||
| 2019-11-01T15:00:00Z | sfo | temp | 66.8 |
|
||||
|
||||
{{% note %}}
|
||||
#### Extract the correct table
|
||||
Flux functions do not guarantee table order and `tableFind()` returns only the
|
||||
**first** table that matches the `fn` predicate.
|
||||
To extract the table that includes the data you actually want, be very specific in
|
||||
your predicate function or filter and transform your data to minimize the number
|
||||
of tables piped-forward into `tableFind()`.
|
||||
{{% /note %}}
|
||||
|
||||
## Extract a column from the table
|
||||
Use the [`getColumn()` function](/{{< latest "flux" >}}/stdlib/universe/stream-table/getcolumn/)
|
||||
to output an array of values from a specific column in the extracted table.
|
||||
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|
||||
|> getColumn(column: "_value")
|
||||
|
||||
// Returns [65.1, 66.2, 66.3, 66.8]
|
||||
```
|
||||
|
||||
### Use extracted column values
|
||||
Use a variable to store the array of values.
|
||||
In the example below, `SFOTemps` represents the array of values.
|
||||
Reference a specific index (integer starting from `0`) in the array to return the
|
||||
value at that index.
|
||||
|
||||
```js
|
||||
SFOTemps = sampleData
|
||||
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|
||||
|> getColumn(column: "_value")
|
||||
|
||||
SFOTemps
|
||||
// Returns [65.1, 66.2, 66.3, 66.8]
|
||||
|
||||
SFOTemps[0]
|
||||
// Returns 65.1
|
||||
|
||||
SFOTemps[2]
|
||||
// Returns 66.3
|
||||
```
|
||||
|
||||
## Extract a row from the table
|
||||
Use the [`getRecord()` function](/{{< latest "flux" >}}/stdlib/universe/stream-table/getrecord/)
|
||||
to output data from a single row in the extracted table.
|
||||
Specify the index of the row to output using the `idx` parameter.
|
||||
The function outputs a record with key-value pairs for each column.
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|
||||
|> getRecord(idx: 0)
|
||||
|
||||
// Returns {
|
||||
// _time:2019-11-01T12:00:00Z,
|
||||
// _field:"temp",
|
||||
// location:"sfo",
|
||||
// _value: 65.1
|
||||
// }
|
||||
```
|
||||
|
||||
### Use an extracted row record
|
||||
Use a variable to store the extracted row record.
|
||||
In the example below, `tempInfo` represents the extracted row.
|
||||
Use [dot notation](/enterprise_influxdb/v1.10/flux/get-started/syntax-basics/#records) to reference
|
||||
keys in the record.
|
||||
|
||||
```js
|
||||
tempInfo = sampleData
|
||||
|> tableFind(fn: (key) => key._field == "temp" and key.location == "sfo")
|
||||
|> getRecord(idx: 0)
|
||||
|
||||
tempInfo
|
||||
// Returns {
|
||||
// _time:2019-11-01T12:00:00Z,
|
||||
// _field:"temp",
|
||||
// location:"sfo",
|
||||
// _value: 65.1
|
||||
// }
|
||||
|
||||
tempInfo._time
|
||||
// Returns 2019-11-01T12:00:00Z
|
||||
|
||||
tempInfo.location
|
||||
// Returns sfo
|
||||
```
|
||||
|
||||
## Example helper functions
|
||||
Create custom helper functions to extract scalar values from query output.
|
||||
|
||||
##### Extract a scalar field value
|
||||
```js
|
||||
// Define a helper function to extract field values
|
||||
getFieldValue = (tables=<-, field) => {
|
||||
extract = tables
|
||||
|> tableFind(fn: (key) => key._field == field)
|
||||
|> getColumn(column: "_value")
|
||||
|
||||
return extract[0]
|
||||
}
|
||||
|
||||
// Use the helper function to define a variable
|
||||
lastJFKTemp = sampleData
|
||||
|> filter(fn: (r) => r.location == "kjfk")
|
||||
|> last()
|
||||
|> getFieldValue(field: "temp")
|
||||
|
||||
lastJFKTemp
|
||||
// Returns 71.2
|
||||
```
|
||||
|
||||
##### Extract scalar row data
|
||||
```js
|
||||
// Define a helper function to extract a row as a record
|
||||
getRow = (tables=<-, field, idx=0) => {
|
||||
extract = tables
|
||||
|> tableFind(fn: (key) => true)
|
||||
|> getRecord(idx: idx)
|
||||
|
||||
return extract
|
||||
}
|
||||
|
||||
// Use the helper function to define a variable
|
||||
lastReported = sampleData
|
||||
|> last()
|
||||
|> getRow(field: "temp")
|
||||
|
||||
"The last location to report was ${lastReported.location}.
|
||||
The temperature was ${string(v: lastReported._value)}°F."
|
||||
|
||||
// Returns:
|
||||
// The last location to report was kord.
|
||||
// The temperature was 38.9°F.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Sample data
|
||||
|
||||
The following sample data set represents fictional temperature metrics collected
|
||||
from three locations.
|
||||
It's formatted in [annotated CSV](https://v2.docs.influxdata.com/v2.0/reference/syntax/annotated-csv/) and imported
|
||||
into the Flux query using the [`csv.from()` function](/{{< latest "flux" >}}/stdlib/csv/from/).
|
||||
|
||||
Place the following at the beginning of your query to use the sample data:
|
||||
|
||||
{{% truncate %}}
|
||||
```js
|
||||
import "csv"
|
||||
|
||||
sampleData = csv.from(csv: "
|
||||
#datatype,string,long,dateTime:RFC3339,string,string,double
|
||||
#group,false,true,false,true,true,false
|
||||
#default,,,,,,
|
||||
,result,table,_time,location,_field,_value
|
||||
,,0,2019-11-01T12:00:00Z,sfo,temp,65.1
|
||||
,,0,2019-11-01T13:00:00Z,sfo,temp,66.2
|
||||
,,0,2019-11-01T14:00:00Z,sfo,temp,66.3
|
||||
,,0,2019-11-01T15:00:00Z,sfo,temp,66.8
|
||||
,,1,2019-11-01T12:00:00Z,kjfk,temp,69.4
|
||||
,,1,2019-11-01T13:00:00Z,kjfk,temp,69.9
|
||||
,,1,2019-11-01T14:00:00Z,kjfk,temp,71.0
|
||||
,,1,2019-11-01T15:00:00Z,kjfk,temp,71.2
|
||||
,,2,2019-11-01T12:00:00Z,kord,temp,46.4
|
||||
,,2,2019-11-01T13:00:00Z,kord,temp,46.3
|
||||
,,2,2019-11-01T14:00:00Z,kord,temp,42.7
|
||||
,,2,2019-11-01T15:00:00Z,kord,temp,38.9
|
||||
")
|
||||
```
|
||||
{{% /truncate %}}
|
|
@ -0,0 +1,64 @@
|
|||
---
|
||||
title: Sort and limit data with Flux
|
||||
seotitle: Sort and limit data in InfluxDB with Flux
|
||||
list_title: Sort and limit
|
||||
description: >
|
||||
Use the `sort()` function to order records within each table by specific columns and the
|
||||
`limit()` function to limit the number of records in output tables to a fixed number, `n`.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Sort and limit
|
||||
parent: Query with Flux
|
||||
weight: 3
|
||||
list_query_example: sort_limit
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sort-limit/
|
||||
v2: /influxdb/v2.0/query-data/flux/sort-limit/
|
||||
---
|
||||
|
||||
Use the [`sort()` function](/{{< latest "flux" >}}/stdlib/universe/sort)
|
||||
to order records within each table by specific columns and the
|
||||
[`limit()` function](/{{< latest "flux" >}}/stdlib/universe/limit)
|
||||
to limit the number of records in output tables to a fixed number, `n`.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
##### Example sorting system uptime
|
||||
|
||||
The following example orders system uptime first by region, then host, then value.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -12h)
|
||||
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|
||||
|> sort(columns: ["region", "host", "_value"])
|
||||
```
|
||||
|
||||
The [`limit()` function](/{{< latest "flux" >}}/stdlib/universe/limit)
|
||||
limits the number of records in output tables to a fixed number, `n`.
|
||||
The following example shows up to 10 records from the past hour.
|
||||
|
||||
```js
|
||||
from(bucket:"db/rp")
|
||||
|> range(start:-1h)
|
||||
|> limit(n:10)
|
||||
```
|
||||
|
||||
You can use `sort()` and `limit()` together to show the top N records.
|
||||
The example below returns the 10 top system uptime values sorted first by
|
||||
region, then host, then value.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -12h)
|
||||
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
|
||||
|> sort(columns: ["region", "host", "_value"])
|
||||
|> limit(n: 10)
|
||||
```
|
||||
|
||||
You have now created a Flux query that sorts and limits data.
|
||||
Flux also provides the [`top()`](/{{< latest "flux" >}}/stdlib/universe/top)
|
||||
and [`bottom()`](/{{< latest "flux" >}}/stdlib/universe/bottom)
|
||||
functions to perform both of these operations at the same time.
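For example, a minimal sketch of returning the ten highest uptime values with `top()`, which sorts on the listed columns (here only `_value`) and limits the output in a single call:

```js
from(bucket: "db/rp")
    |> range(start: -12h)
    |> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
    |> top(n: 10, columns: ["_value"])
```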
|
|
@ -0,0 +1,215 @@
|
|||
---
|
||||
title: Query SQL data sources
|
||||
seotitle: Query SQL data sources with InfluxDB
|
||||
list_title: Query SQL data
|
||||
description: >
|
||||
The Flux `sql` package provides functions for working with SQL data sources.
|
||||
Use `sql.from()` to query SQL databases like PostgreSQL and MySQL.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Query with Flux
|
||||
list_title: SQL data
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sql/
|
||||
v2: /influxdb/v2.0/query-data/flux/sql/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query: "SELECT * FROM example_table",
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
The Flux `sql` package provides functions for working with SQL data sources.
|
||||
[`sql.from()`](/{{< latest "flux" >}}/stdlib/sql/from/) lets you query SQL data sources
|
||||
like [PostgreSQL](https://www.postgresql.org/), [MySQL](https://www.mysql.com/),
|
||||
and [SQLite](https://www.sqlite.org/index.html), and use the results with InfluxDB
|
||||
dashboards, tasks, and other operations.
|
||||
|
||||
- [Query a SQL data source](#query-a-sql-data-source)
|
||||
- [Join SQL data with data in InfluxDB](#join-sql-data-with-data-in-influxdb)
|
||||
- [Sample sensor data](#sample-sensor-data)
|
||||
|
||||
## Query a SQL data source
|
||||
To query a SQL data source:
|
||||
|
||||
1. Import the `sql` package in your Flux query
|
||||
2. Use the `sql.from()` function to specify the driver, data source name (DSN),
|
||||
and query used to query data from your SQL data source:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[PostgreSQL](#)
|
||||
[MySQL](#)
|
||||
[SQLite](#)
|
||||
{{% /code-tabs %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query: "SELECT * FROM example_table",
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "mysql",
|
||||
dataSourceName: "user:password@tcp(localhost:3306)/db",
|
||||
query: "SELECT * FROM example_table",
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
// NOTE: InfluxDB OSS and InfluxDB Cloud do not have access to
|
||||
// the local filesystem and cannot query SQLite data sources.
|
||||
// Use the Flux REPL to query an SQLite data source.
|
||||
|
||||
import "sql"
|
||||
sql.from(
|
||||
driverName: "sqlite3",
|
||||
dataSourceName: "file:/path/to/test.db?cache=shared&mode=ro",
|
||||
query: "SELECT * FROM example_table",
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
_See the [`sql.from()` documentation](/{{< latest "flux" >}}/stdlib/sql/from/) for
|
||||
information about required function parameters._
|
||||
|
||||
## Join SQL data with data in InfluxDB
|
||||
One of the primary benefits of querying SQL data sources from InfluxDB
|
||||
is the ability to enrich query results with data stored outside of InfluxDB.
|
||||
|
||||
Using the [air sensor sample data](#sample-sensor-data) below, the following query
|
||||
joins air sensor metrics stored in InfluxDB with sensor information stored in PostgreSQL.
|
||||
The joined data lets you query and filter results based on sensor information
|
||||
that isn't stored in InfluxDB.
|
||||
|
||||
```js
|
||||
// Import the "sql" package
|
||||
import "sql"
|
||||
|
||||
// Query data from PostgreSQL
|
||||
sensorInfo = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://localhost?sslmode=disable",
|
||||
query: "SELECT * FROM sensors",
|
||||
)
|
||||
|
||||
// Query data from InfluxDB
|
||||
sensorMetrics = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "airSensors")
|
||||
|
||||
// Join InfluxDB query results with PostgreSQL query results
|
||||
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
|
||||
```
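Because the joined output includes the PostgreSQL columns, you can filter on them directly. The following is an illustrative sketch only; the `location` value `"main_lobby"` is a placeholder, not part of the sample data.

```js
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
    |> filter(fn: (r) => r.location == "main_lobby")  // keep only sensors in a given room
```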
|
||||
|
||||
---
|
||||
|
||||
## Sample sensor data
|
||||
The [sample data generator](#download-and-run-the-sample-data-generator) and
|
||||
[sample sensor information](#import-the-sample-sensor-information) simulate a
|
||||
group of sensors that measure temperature, humidity, and carbon monoxide
|
||||
in rooms throughout a building.
|
||||
Each collected data point is stored in InfluxDB with a `sensor_id` tag that identifies
|
||||
the specific sensor it came from.
|
||||
Sample sensor information is stored in PostgreSQL.
|
||||
|
||||
**Sample data includes:**
|
||||
|
||||
- Simulated data collected from each sensor and stored in the `airSensors` measurement in **InfluxDB**:
|
||||
- temperature
|
||||
- humidity
|
||||
- co
|
||||
|
||||
- Information about each sensor stored in the `sensors` table in **PostgreSQL**:
|
||||
- sensor_id
|
||||
- location
|
||||
- model_number
|
||||
- last_inspected
|
||||
|
||||
### Import and generate sample sensor data
|
||||
|
||||
#### Download and run the sample data generator
|
||||
`air-sensor-data.rb` is a script that generates air sensor data and stores the data in InfluxDB.
|
||||
To use `air-sensor-data.rb`:
|
||||
|
||||
1. [Create a database](/enterprise_influxdb/v1.10/introduction/get-started/#creating-a-database) to store the data.
|
||||
2. Download the sample data generator. _This tool requires [Ruby](https://www.ruby-lang.org/en/)._
|
||||
|
||||
<a class="btn download" style="color:#fff" href="/downloads/air-sensor-data.rb" download>Download Air Sensor Generator</a>
|
||||
|
||||
3. Give `air-sensor-data.rb` executable permissions:
|
||||
|
||||
```
|
||||
chmod +x air-sensor-data.rb
|
||||
```
|
||||
|
||||
4. Start the generator. Specify your database.
|
||||
|
||||
```
|
||||
./air-sensor-data.rb -d database-name
|
||||
```
|
||||
|
||||
The generator begins to write data to InfluxDB and will continue until stopped.
|
||||
Use `ctrl-c` to stop the generator.
|
||||
|
||||
_**Note:** Use the `--help` flag to view other configuration options._
|
||||
|
||||
|
||||
5. Query your target database to ensure the generated data is writing successfully.
|
||||
The generator doesn't catch errors from write requests, so it will continue running
|
||||
even if data is not being written to InfluxDB successfully.
|
||||
|
||||
```
|
||||
from(bucket: "database-name/autogen")
|
||||
|> range(start: -1m)
|
||||
|> filter(fn: (r) => r._measurement == "airSensors")
|
||||
```
|
||||
|
||||
#### Import the sample sensor information
|
||||
1. [Download and install PostgreSQL](https://www.postgresql.org/download/).
|
||||
2. Download the sample sensor information CSV.
|
||||
|
||||
<a class="btn download" style="color:#fff" href="/downloads/sample-sensor-info.csv" download>Download Sample Data</a>
|
||||
|
||||
3. Use a PostgreSQL client (`psql` or a GUI) to create the `sensors` table:
|
||||
|
||||
```
|
||||
CREATE TABLE sensors (
|
||||
sensor_id character varying(50),
|
||||
location character varying(50),
|
||||
model_number character varying(50),
|
||||
last_inspected date
|
||||
);
|
||||
```
|
||||
|
||||
4. Import the downloaded CSV sample data.
|
||||
_Update the `FROM` file path to the path of the downloaded CSV sample data._
|
||||
|
||||
```
|
||||
COPY sensors(sensor_id,location,model_number,last_inspected)
|
||||
FROM '/path/to/sample-sensor-info.csv' DELIMITER ',' CSV HEADER;
|
||||
```
|
||||
|
||||
5. Query the table to ensure the data was imported correctly:
|
||||
|
||||
```
|
||||
SELECT * FROM sensors;
|
||||
```
|
|
@ -0,0 +1,351 @@
|
|||
---
|
||||
title: Window and aggregate data with Flux
|
||||
seotitle: Window and aggregate data in InfluxDB with Flux
|
||||
list_title: Window & aggregate
|
||||
description: >
|
||||
This guide walks through windowing and aggregating data with Flux and outlines
|
||||
how it shapes your data in the process.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Window & aggregate
|
||||
parent: Query with Flux
|
||||
weight: 4
|
||||
list_query_example: aggregate_window
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/window-aggregate/
|
||||
v2: /influxdb/v2.0/query-data/flux/window-aggregate/
|
||||
---
|
||||
|
||||
A common operation performed with time series data is grouping data into windows of time,
|
||||
or "windowing" data, then aggregating windowed values into a new value.
|
||||
This guide walks through windowing and aggregating data with Flux and demonstrates
|
||||
how data is shaped in the process.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.10/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.10/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
{{% note %}}
|
||||
The following example is an in-depth walk-through of the steps required to window and aggregate data.
|
||||
The [`aggregateWindow()` function](#summing-up) performs these operations for you, but understanding
|
||||
how data is shaped in the process helps you create your desired output.
|
||||
{{% /note %}}
|
||||
|
||||
## Data set
|
||||
For the purposes of this guide, define a variable that represents your base data set.
|
||||
The following example queries the memory usage of the host machine.
|
||||
|
||||
```js
|
||||
dataSet = from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
|> drop(columns: ["host"])
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
This example drops the `host` column from the returned data since the memory data
|
||||
is only tracked for a single host and it simplifies the output tables.
|
||||
Dropping the `host` column is optional and not recommended if monitoring memory
|
||||
on multiple hosts.
|
||||
{{% /note %}}
|
||||
|
||||
`dataSet` can now be used to represent your base data, which will look similar to the following:
|
||||
|
||||
{{% truncate %}}
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## Windowing data
|
||||
Use the [`window()` function](/{{< latest "flux" >}}/stdlib/universe/window)
|
||||
to group your data based on time bounds.
|
||||
The most common parameter passed to `window()` is `every`, which
|
||||
defines the duration of time between windows.
|
||||
Other parameters are available, but for this example, window the base data
|
||||
set into one-minute windows.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The `every` parameter supports all [valid duration units](/{{< latest "flux" >}}/language/types/#duration-types),
|
||||
including **calendar months (`1mo`)** and **years (`1y`)**.
|
||||
{{% /note %}}
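For example, a calendar-aligned window over a longer range might look like the following sketch (hypothetical; this guide's five-minute data set is too short for monthly windows):

```js
// Group a year of memory data into calendar-month windows (hypothetical example)
from(bucket: "db/rp")
    |> range(start: -1y)
    |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
    |> window(every: 1mo)
```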
|
||||
|
||||
Each window of time is output in its own table containing all records that fall within the window.
|
||||
|
||||
{{% truncate %}}
|
||||
###### window() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
When visualized in the InfluxDB UI, each window table is displayed in a different color.
|
||||
|
||||
![Windowed data](/img/flux/simple-windowed-data.png)
|
||||
|
||||
## Aggregate data
|
||||
[Aggregate functions](/{{< latest "flux" >}}/stdlib/universe) take the values
|
||||
of all rows in a table and use them to perform an aggregate operation.
|
||||
The result is output as a new value in a single-row table.
|
||||
|
||||
Since windowed data is split into separate tables, aggregate operations run against
|
||||
each table separately and output new tables containing only the aggregated value.
|
||||
|
||||
For this example, use the [`mean()` function](/{{< latest "flux" >}}/stdlib/universe/mean)
|
||||
to output the average of each window:
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### mean() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 65.88549613952637
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 65.50651391347249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 65.30719598134358
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 64.39330975214641
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49386278788249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
Because each data point is contained in its own table, when visualized,
|
||||
they appear as single, unconnected points.
|
||||
|
||||
![Aggregated windowed data](/img/flux/simple-windowed-aggregate-data.png)
|
||||
|
||||
### Recreate the time column
|
||||
**Notice the `_time` column is not in the [aggregated output tables](#mean-output-tables).**
|
||||
Because records in each table are aggregated together, their timestamps no longer
|
||||
apply and the column is removed from the group key and table.
|
||||
|
||||
Also notice the `_start` and `_stop` columns still exist.
|
||||
These represent the lower and upper bounds of the time window.
|
||||
|
||||
Many Flux functions rely on the `_time` column.
|
||||
To further process your data after an aggregate function, you need to re-add `_time`.
|
||||
Use the [`duplicate()` function](/{{< latest "flux" >}}/stdlib/universe/duplicate) to
|
||||
duplicate either the `_start` or `_stop` column as a new `_time` column.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### duplicate() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## "Unwindow" aggregate tables
|
||||
Keeping aggregate values in separate tables generally isn't how you want your data structured.
|
||||
Use the `window()` function to "unwindow" your data into a single infinite (`inf`) window.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
|> window(every: inf)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Windowing requires a `_time` column, which is why it's necessary to
|
||||
[recreate the `_time` column](#recreate-the-time-column) after an aggregation.
|
||||
{{% /note %}}
|
||||
|
||||
###### Unwindowed output table
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
|
||||
With the aggregate values in a single table, data points in the visualization are connected.
|
||||
|
||||
![Unwindowed aggregate data](/img/flux/simple-unwindowed-data.png)
|
||||
|
||||
## Summing up
|
||||
You have now created a Flux query that windows and aggregates data.
|
||||
The data transformation process outlined in this guide should be used for all aggregation operations.
|
||||
|
||||
Flux also provides the [`aggregateWindow()` function](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow)
|
||||
which performs all these separate functions for you.
|
||||
|
||||
The following Flux query will return the same results:
|
||||
|
||||
###### aggregateWindow function
|
||||
```js
|
||||
dataSet
|
||||
|> aggregateWindow(every: 1m, fn: mean)
|
||||
```
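Under the hood, `aggregateWindow()` windows the data, applies the aggregate, recreates `_time`, and unwindows the results. If it helps to see those choices spelled out, the following sketch passes the relevant parameters explicitly (the values shown are assumed defaults; check the `aggregateWindow()` reference for your Flux version):

```js
dataSet
    |> aggregateWindow(
        every: 1m,          // Window duration
        fn: mean,           // Aggregate function applied to each window
        column: "_value",   // Column to aggregate
        timeSrc: "_stop",   // Column duplicated into the recreated time column
        timeDst: "_time",   // Name of the recreated time column
        createEmpty: true   // Keep empty windows in the output
    )
```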
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: Enable Flux
|
||||
description: Instructions for enabling Flux in your InfluxDB configuration.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Enable Flux
|
||||
parent: Flux
|
||||
weight: 1
|
||||
---
|
||||
|
||||
Flux is packaged with **InfluxDB v1.8+** and does not require any additional installation.
|
||||
However, it is **disabled by default and must be enabled**.
|
||||
|
||||
## Enable Flux
|
||||
Enable Flux by setting the `flux-enabled` option to `true` under the `[http]` section of your `influxdb.conf`:
|
||||
|
||||
###### influxdb.conf
|
||||
```toml
|
||||
# ...
|
||||
|
||||
[http]
|
||||
|
||||
# ...
|
||||
|
||||
flux-enabled = true
|
||||
|
||||
# ...
|
||||
```
|
||||
|
||||
> The default location of your `influxdb.conf` depends on your operating system.
|
||||
> More information is available in the [Configuring InfluxDB](/enterprise_influxdb/v1.10/administration/config/#using-the-configuration-file) guide.
|
||||
|
||||
When InfluxDB starts, the Flux daemon starts as well, and data can be queried using Flux.
|
|
@ -0,0 +1,180 @@
|
|||
---
|
||||
title: Optimize Flux queries
|
||||
description: >
|
||||
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
|
||||
weight: 4
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
parent: Flux
|
||||
canonical: /influxdb/cloud/query-data/optimize-queries/
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/flux/guides/optimize-queries
|
||||
---
|
||||
|
||||
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
|
||||
|
||||
- [Start queries with pushdowns](#start-queries-with-pushdowns)
|
||||
- [Avoid processing filters inline](#avoid-processing-filters-inline)
|
||||
- [Avoid short window durations](#avoid-short-window-durations)
|
||||
- [Use "heavy" functions sparingly](#use-heavy-functions-sparingly)
|
||||
- [Use set() instead of map() when possible](#use-set-instead-of-map-when-possible)
|
||||
- [Balance time range and data precision](#balance-time-range-and-data-precision)
|
||||
- [Measure query performance with Flux profilers](#measure-query-performance-with-flux-profilers)
|
||||
|
||||
## Start queries with pushdowns
|
||||
**Pushdowns** are functions or function combinations that push data operations to the underlying data source rather than operating on data in memory. Start queries with pushdowns to improve query performance. Once a non-pushdown function runs, Flux pulls data into memory and runs all subsequent operations there.
|
||||
|
||||
#### Pushdown functions and function combinations
|
||||
The following pushdowns are supported in InfluxDB Enterprise 1.10+.
|
||||
|
||||
| Functions | Supported |
|
||||
| :----------------------------- | :------------------: |
|
||||
| **count()** | {{< icon "check" "v2" >}} |
|
||||
| **drop()** | {{< icon "check" "v2" >}} |
|
||||
| **duplicate()** | {{< icon "check" "v2" >}} |
|
||||
| **filter()** {{% req " \*" %}} | {{< icon "check" "v2" >}} |
|
||||
| **fill()** | {{< icon "check" "v2" >}} |
|
||||
| **first()** | {{< icon "check" "v2" >}} |
|
||||
| **group()** | {{< icon "check" "v2" >}} |
|
||||
| **keep()** | {{< icon "check" "v2" >}} |
|
||||
| **last()** | {{< icon "check" "v2" >}} |
|
||||
| **max()** | {{< icon "check" "v2" >}} |
|
||||
| **mean()** | {{< icon "check" "v2" >}} |
|
||||
| **min()** | {{< icon "check" "v2" >}} |
|
||||
| **range()** | {{< icon "check" "v2" >}} |
|
||||
| **rename()** | {{< icon "check" "v2" >}} |
|
||||
| **sum()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** | {{< icon "check" "v2" >}} |
|
||||
| _Function combinations_ | |
|
||||
| **window()** \|> **count()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** \|> **first()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** \|> **last()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** \|> **max()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** \|> **min()** | {{< icon "check" "v2" >}} |
|
||||
| **window()** \|> **sum()** | {{< icon "check" "v2" >}} |
|
||||
|
||||
{{% caption %}}
|
||||
{{< req "\*" >}} **filter()** only pushes down when all parameter values are static.
|
||||
See [Avoid processing filters inline](#avoid-processing-filters-inline).
|
||||
{{% /caption %}}
|
||||
|
||||
Use pushdown functions and function combinations at the beginning of your query.
|
||||
Once a non-pushdown function runs, Flux pulls data into memory and runs all
|
||||
subsequent operations there.
|
||||
|
||||
##### Pushdown functions in use
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h) //
|
||||
|> filter(fn: (r) => r.sensor == "abc123") //
|
||||
|> group(columns: ["_field", "host"]) // Pushed to the data source
|
||||
|> aggregateWindow(every: 5m, fn: max) //
|
||||
|> filter(fn: (r) => r._value >= 90.0) //
|
||||
|
||||
|> top(n: 10) // Run in memory
|
||||
```
|
||||
|
||||
### Avoid processing filters inline
|
||||
Avoid using mathematic operations or string manipulation inline to define data filters.
|
||||
Processing filter values inline prevents `filter()` from pushing its operation down
|
||||
to the underlying data source, so data returned by the
|
||||
previous function loads into memory.
|
||||
This often results in a significant performance hit.
|
||||
|
||||
For example, the following query uses [Chronograf dashboard template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
|
||||
and string concatenation to define a region to filter by.
|
||||
Because `filter()` uses string concatenation inline, it can't push its operation
|
||||
to the underlying data source and loads all data returned from `range()` into memory.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r.region == v.provider + v.region)
|
||||
```
|
||||
|
||||
To dynamically set filters and maintain the pushdown ability of the `filter()` function,
|
||||
use variables to define filter values outside of `filter()`:
|
||||
|
||||
```js
|
||||
region = v.provider + v.region
|
||||
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r.region == region)
|
||||
```
|
||||
|
||||
## Avoid short window durations
|
||||
Windowing (grouping data based on time intervals) is commonly used to aggregate and downsample data.
|
||||
Increase performance by avoiding short window durations.
|
||||
More windows require more compute power to evaluate which window each row should be assigned to.
|
||||
Reasonable window durations depend on the total time range queried.
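As a rough illustration (assuming the same hypothetical `db/rp` bucket and memory measurement used elsewhere in these guides), a 24-hour query windowed every second produces roughly 86,400 windows per series, while one-minute windows produce only 1,440:

```js
// Prefer window durations proportional to the queried range:
// -24h at 1s ≈ 86,400 windows per series; -24h at 1m = 1,440 windows per series.
from(bucket: "db/rp")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
    |> aggregateWindow(every: 1m, fn: mean)
```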
|
||||
|
||||
## Use "heavy" functions sparingly
|
||||
The following functions use more memory or CPU than others.
|
||||
Consider their necessity in your data processing before using them:
|
||||
|
||||
- [map()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
|
||||
- [reduce()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
|
||||
- [join()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/join/)
|
||||
- [union()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/union/)
|
||||
- [pivot()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/pivot/)
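If you do need one of these functions, run it as late in the query as possible, after pushdowns have reduced the amount of data it has to process. A minimal sketch, assuming a hypothetical `cpu` measurement:

```js
from(bucket: "db/rp")
    |> range(start: -1h)                                    // Pushed to the data source
    |> filter(fn: (r) => r._measurement == "cpu")           // Pushed to the data source
    |> aggregateWindow(every: 5m, fn: mean)                 // Pushed to the data source
    |> map(fn: (r) => ({r with _value: r._value * 100.0}))  // Heavy function runs on the reduced data
```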
|
||||
|
||||
## Use set() instead of map() when possible
|
||||
[`set()`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/set/),
|
||||
[`experimental.set()`](/influxdb/v2.0/reference/flux/stdlib/experimental/set/),
|
||||
and [`map`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
|
||||
can each set column values in your data. However, the **set** functions have performance
|
||||
advantages over `map()`.
|
||||
|
||||
Use the following guidelines to determine which to use:
|
||||
|
||||
- If setting a column value to a predefined, static value, use `set()` or `experimental.set()`.
|
||||
- If dynamically setting a column value using **existing row data**, use `map()`.
|
||||
|
||||
#### Set a column value to a static value
|
||||
The following queries are functionally the same, but using `set()` is more performant than using `map()`.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with foo: "bar" }))
|
||||
|
||||
// Recommended
|
||||
data
|
||||
|> set(key: "foo", value: "bar")
|
||||
```
|
||||
|
||||
#### Dynamically set a column value using existing row data
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with foo: r.bar }))
|
||||
```
|
||||
|
||||
## Balance time range and data precision
|
||||
To ensure queries are performant, balance the time range and the precision of your data.
|
||||
For example, if you query data stored every second and request six months' worth of data,
|
||||
results would include ≈15.5 million points per series.
|
||||
Depending on the number of series returned after `filter()` ([cardinality](/enterprise_influxdb/v1.10/concepts/glossary/#series-cardinality)),
|
||||
this can quickly become many billions of points.
|
||||
Flux must store these points in memory to generate a response.
|
||||
Use [pushdowns](#pushdown-functions-and-function-combinations) to optimize how
|
||||
many points are stored in memory.
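One way to balance the two is to downsample within the query itself, so raw precision stays in storage while the response matches the queried time range. A sketch, assuming the same hypothetical bucket and measurement:

```js
// ~26 weeks of per-second data returned raw ≈ 15.5 million points per series.
// Downsampled to hourly means, the same range returns ≈ 4,400 points per series.
from(bucket: "db/rp")
    |> range(start: -26w)
    |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
    |> aggregateWindow(every: 1h, fn: mean)
```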
|
||||
|
||||
## Measure query performance with Flux profilers
|
||||
Use the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/)
|
||||
to measure query performance and append performance metrics to your query output.
|
||||
The following Flux profilers are available:
|
||||
|
||||
- **query**: provides statistics about the execution of an entire Flux script.
|
||||
- **operator**: provides statistics about each operation in a query.
|
||||
|
||||
Import the `profiler` package and enable profilers with the `profiler.enabledProfilers` option.
|
||||
|
||||
```js
|
||||
import "profiler"
|
||||
|
||||
option profiler.enabledProfilers = ["query", "operator"]
|
||||
|
||||
// Query to profile
|
||||
```
|
||||
|
||||
For more information about Flux profilers, see the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/).
|
|
@ -0,0 +1,12 @@
|
|||
---
|
||||
title: InfluxDB Enterprise guides
|
||||
description: Step-by-step guides for using InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/guides/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Guides
|
||||
weight: 60
|
||||
---
|
||||
|
||||
{{< children hlevel="h2" >}}
|
|
@ -0,0 +1,191 @@
|
|||
---
|
||||
title: Authenticate requests to InfluxDB Enterprise
|
||||
description: >
|
||||
Authenticate requests to InfluxDB Enterprise with basic authentication, query string
parameters, the InfluxDB CLI, or JWT tokens.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 25
|
||||
parent: Guides
|
||||
name: Authenticate requests
|
||||
---
|
||||
|
||||
_To require valid credentials for cluster access, see ["Enable authentication"](/enterprise_influxdb/v1.10/administration/configure/security/authentication/)._
|
||||
|
||||
## Authenticate requests
|
||||
|
||||
### Authenticate with the InfluxDB API
|
||||
|
||||
Authenticate with the [InfluxDB API](/enterprise_influxdb/v1.10/tools/api/) using one of the following options:
|
||||
|
||||
- [Authenticate with basic authentication](#authenticate-with-basic-authentication)
|
||||
- [Authenticate with query parameters in the URL or request body](#authenticate-with-query-parameters-in-the-url-or-request-body)
|
||||
|
||||
If you authenticate with both basic authentication **and** the URL query parameters,
|
||||
the user credentials specified in the query parameters take precedence.
|
||||
The following examples demonstrate queries with [admin user](#admin-users) permissions.
|
||||
To learn about different user types, permissions, and how to manage users, see [authorization](#authorization).
|
||||
|
||||
{{% note %}}
|
||||
InfluxDB Enterprise redacts passwords in log output when you enable authentication.
|
||||
{{% /note %}}
|
||||
|
||||
#### Authenticate with basic authentication
|
||||
```bash
|
||||
curl -G http://localhost:8086/query \
|
||||
-u todd:password4todd \
|
||||
--data-urlencode "q=SHOW DATABASES"
|
||||
```
|
||||
|
||||
#### Authenticate with query parameters in the URL or request body
|
||||
Set `u` as the username and `p` as the password.
|
||||
|
||||
##### Credentials as query parameters
|
||||
```bash
|
||||
curl -G "http://localhost:8086/query?u=todd&p=password4todd" \
|
||||
--data-urlencode "q=SHOW DATABASES"
|
||||
```
|
||||
|
||||
##### Credentials in the request body
|
||||
```bash
|
||||
curl -G http://localhost:8086/query \
|
||||
--data-urlencode "u=todd" \
|
||||
--data-urlencode "p=password4todd" \
|
||||
--data-urlencode "q=SHOW DATABASES"
|
||||
```
|
||||
|
||||
### Authenticate with the CLI
|
||||
|
||||
There are three options for authenticating with the [CLI](/enterprise_influxdb/v1.10/tools/influx-cli/):
|
||||
|
||||
- [Authenticate with environment variables](#authenticate-with-environment-variables)
|
||||
- [Authenticate with CLI flags](#authenticate-with-cli-flags)
|
||||
- [Authenticate with credentials in the influx shell](#authenticate-with-credentials-in-the-influx-shell)
|
||||
|
||||
#### Authenticate with environment variables
|
||||
Use the `INFLUX_USERNAME` and `INFLUX_PASSWORD` environment variables to provide
|
||||
authentication credentials to the `influx` CLI.
|
||||
|
||||
```bash
|
||||
export INFLUX_USERNAME=todd
|
||||
export INFLUX_PASSWORD=password4todd
|
||||
echo $INFLUX_USERNAME $INFLUX_PASSWORD
|
||||
todd password4todd
|
||||
|
||||
influx
|
||||
Connected to http://localhost:8086 version {{< latest-patch >}}
|
||||
InfluxDB shell {{< latest-patch >}}
|
||||
```
|
||||
|
||||
#### Authenticate with CLI flags
|
||||
Use the `-username` and `-password` flags to provide authentication credentials
|
||||
to the `influx` CLI.
|
||||
|
||||
```bash
|
||||
influx -username todd -password password4todd
|
||||
Connected to http://localhost:8086 version {{< latest-patch >}}
|
||||
InfluxDB shell {{< latest-patch >}}
|
||||
```
|
||||
|
||||
#### Authenticate with credentials in the influx shell
|
||||
Start the `influx` shell and run the `auth` command.
|
||||
Enter your username and password when prompted.
|
||||
|
||||
```bash
|
||||
$ influx
|
||||
Connected to http://localhost:8086 version {{< latest-patch >}}
|
||||
InfluxDB shell {{< latest-patch >}}
|
||||
> auth
|
||||
username: todd
|
||||
password:
|
||||
>
|
||||
```
|
||||
|
||||
### Authenticate using JWT tokens
|
||||
For a more secure alternative to using passwords, include JWT tokens with requests to the InfluxDB API.
|
||||
This is currently only possible through the [InfluxDB HTTP API](/enterprise_influxdb/v1.10/tools/api/).
|
||||
|
||||
1. **Add a shared secret in your InfluxDB Enterprise configuration file**.
|
||||
|
||||
InfluxDB Enterprise uses the shared secret to encode the JWT signature.
|
||||
By default, `shared-secret` is set to an empty string, in which case no JWT authentication takes place.
|
||||
<!-- TODO: meta, data, or both? -->
|
||||
Add a custom shared secret in your [InfluxDB configuration file](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#shared-secret).
|
||||
The longer the secret string, the more secure it is:
|
||||
|
||||
```toml
|
||||
[http]
|
||||
shared-secret = "my super secret pass phrase"
|
||||
```
|
||||
|
||||
Alternatively, to avoid keeping your secret phrase as plain text in your InfluxDB configuration file,
|
||||
set the value with the `INFLUXDB_HTTP_SHARED_SECRET` environment variable.
|
||||
|
||||
2. **Generate your JWT token**.
|
||||
|
||||
Use an authentication service to generate a secure token
|
||||
using your InfluxDB username, an expiration time, and your shared secret.
|
||||
There are online tools, such as [https://jwt.io/](https://jwt.io/), that will do this for you.
|
||||
|
||||
The payload (or claims) of the token must be in the following format:
|
||||
|
||||
```json
|
||||
{
|
||||
"username": "myUserName",
|
||||
"exp": 1516239022
|
||||
}
|
||||
```
|
||||
|
||||
- **username** - The name of your InfluxDB user.
|
||||
- **exp** - The expiration time of the token in UNIX epoch time.
|
||||
For increased security, keep token expiration periods short.
|
||||
For testing, you can manually generate UNIX timestamps using [https://www.unixtimestamp.com/index.php](https://www.unixtimestamp.com/index.php).
|
||||
|
||||
Encode the payload using your shared secret.
|
||||
You can do this with either a JWT library in your own authentication server or by hand at [https://jwt.io/](https://jwt.io/).
|
||||
|
||||
The generated token follows this format: `<header>.<payload>.<signature>`
|
||||
|
||||
3. **Include the token in HTTP requests**.
|
||||
|
||||
Include your generated token as part of the `Authorization` header in HTTP requests:
|
||||
|
||||
```
|
||||
Authorization: Bearer <myToken>
|
||||
```
|
||||
{{% note %}}
|
||||
Only unexpired tokens will successfully authenticate.
|
||||
Be sure your token has not expired.
|
||||
{{% /note %}}
|
||||
|
||||
#### Example query request with JWT authentication
|
||||
```bash
|
||||
curl -G "http://localhost:8086/query?db=demodb" \
|
||||
--data-urlencode "q=SHOW DATABASES" \
|
||||
--header "Authorization: Bearer <header>.<payload>.<signature>"
|
||||
```
|
||||
|
||||
## Authenticate Telegraf requests to InfluxDB
|
||||
|
||||
Authenticating [Telegraf](/{{< latest "telegraf" >}}/) requests to an InfluxDB instance with
|
||||
authentication enabled requires some additional steps.
|
||||
In the Telegraf configuration file (`/etc/telegraf/telegraf.conf`), uncomment
|
||||
and edit the `username` and `password` settings.
|
||||
|
||||
```toml
|
||||
###############################################################################
|
||||
# OUTPUT PLUGINS #
|
||||
###############################################################################
|
||||
|
||||
# ...
|
||||
|
||||
[[outputs.influxdb]]
|
||||
# ...
|
||||
username = "example-username" # Provide your username
|
||||
password = "example-password" # Provide your password
|
||||
|
||||
# ...
|
||||
```
|
||||
|
||||
Restart Telegraf and you're all set!
|
||||
|
|
@ -0,0 +1,274 @@
|
|||
---
|
||||
title: Calculate percentages in a query
|
||||
description: >
|
||||
Calculate percentages using basic math operators available in InfluxQL or Flux.
|
||||
This guide walks through use-cases and examples of calculating percentages from two values in a single query.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 50
|
||||
parent: Guides
|
||||
name: Calculate percentages
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/calculating_percentages/
|
||||
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
|
||||
---
|
||||
|
||||
Use Flux or InfluxQL to calculate percentages in a query.
|
||||
|
||||
{{< tabs-wrapper >}}
|
||||
{{% tabs %}}
|
||||
[Flux](#)
|
||||
[InfluxQL](#)
|
||||
{{% /tabs %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
[Flux](/flux/latest/) lets you perform simple math operations, for example, calculating a percentage.
|
||||
|
||||
## Calculate a percentage
|
||||
|
||||
Learn how to calculate a percentage using the following examples:
|
||||
|
||||
- [Basic calculations within a query](#basic-calculations-within-a-query)
|
||||
- [Calculate a percentage from two fields](#calculate-a-percentage-from-two-fields)
|
||||
- [Calculate a percentage using aggregate functions](#calculate-a-percentage-using-aggregate-functions)
|
||||
- [Calculate the percentage of total weight per apple variety](#calculate-the-percentage-of-total-weight-per-apple-variety)
|
||||
- [Calculate the average percentage of total weight per variety each hour](#calculate-the-average-percentage-of-total-weight-per-variety-each-hour)
|
||||
|
||||
## Basic calculations within a query
|
||||
|
||||
When performing any math operation in a Flux query, you must complete the following steps:
|
||||
|
||||
1. Specify the [bucket](/{{< latest "influxdb" "v2" >}}/query-data/get-started/#buckets) to query from and the time range to query.
|
||||
2. Filter your data by measurements, fields, and other applicable criteria.
|
||||
3. Align values in one row (required to perform math in Flux) by using one of the following functions:
|
||||
- To query **from multiple** data sources, use the [`join()` function](/{{< latest "flux" >}}/stdlib/universe/join/).
|
||||
- To query **from the same** data source, use the [`pivot()` function](/{{< latest "flux" >}}/stdlib/universe/pivot/).
|
||||
|
||||
For examples using the `join()` function to calculate percentages and more examples of calculating percentages, see [Calculate percentages with Flux](/{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/).
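For reference, a minimal `join()` sketch that aligns two hypothetical fields from different buckets on `_time` (the bucket, measurement, and field names are placeholders):

```js
field1 = from(bucket: "db1/rp")
    |> range(start: -15m)
    |> filter(fn: (r) => r._measurement == "measurement_name" and r._field == "field1")

field2 = from(bucket: "db2/rp")
    |> range(start: -15m)
    |> filter(fn: (r) => r._measurement == "measurement_name" and r._field == "field2")

// Join on time so each output row carries both values
join(tables: {f1: field1, f2: field2}, on: ["_time"])
```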
|
||||
|
||||
#### Data variable
|
||||
|
||||
To shorten examples, we'll store a basic Flux query in a `data` variable for reuse.
|
||||
|
||||
Here's how that looks in Flux:
|
||||
|
||||
```js
|
||||
// Query data from the past 15 minutes and pivot fields into columns so each row
|
||||
// contains values for each field
|
||||
data = from(bucket: "your_db/your_retention_policy")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /field[1-2]/)
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
```
|
||||
|
||||
Each row now contains the values necessary to perform a math operation. For example, to add two field keys, start with the `data` variable created above, and then use `map()` to re-map values in each row.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with _value: r.field1 + r.field2}))
|
||||
```
|
||||
|
||||
> **Note:** Flux supports basic math operators such as `+`,`-`,`/`, `*`, and `()`. For example, to subtract `field2` from `field1`, change `+` to `-`.
|
||||
|
||||
## Calculate a percentage from two fields
|
||||
|
||||
Use the `data` variable created above, and then use the [`map()` function](/{{< latest "flux" >}}/stdlib/universe/map/) to divide one field by another, multiply by 100, and add a new `percent` field to store the percentage values in.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(
|
||||
fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "percent",
|
||||
_value: r.field1 / r.field2 * 100.0
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
>**Note:** In this example, `field1` and `field2` are float values, so they're multiplied by 100.0. For integer values, multiply by 100 or use the `float()` function to cast integers to floats.
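If `field1` and `field2` were integer fields, a hedged variant of the same map using `float()` might look like:

```js
data
    |> map(
        fn: (r) => ({
            _time: r._time,
            _measurement: r._measurement,
            _field: "percent",
            // Cast integer fields to floats before dividing
            _value: float(v: r.field1) / float(v: r.field2) * 100.0
        })
    )
```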
|
||||
|
||||
## Calculate a percentage using aggregate functions
|
||||
|
||||
Use [`aggregateWindow()`](/{{< latest "flux" >}}/stdlib/universe/aggregatewindow) to window data by time and perform an aggregate function on each window.
|
||||
|
||||
```js
|
||||
from(bucket: "<database>/<retention_policy>")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /fieldkey[1-2]/)
|
||||
|> aggregateWindow(every: 1m, fn: sum)
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))
|
||||
```
|
||||
|
||||
## Calculate the percentage of total weight per apple variety
|
||||
|
||||
Use simulated apple stand data to track the weight of apples (by type) throughout a day.
|
||||
|
||||
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
|
||||
2. Import the sample data:
|
||||
|
||||
```bash
|
||||
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
|
||||
```
|
||||
|
||||
Use the following query to calculate the percentage of the total weight each variety
|
||||
accounts for at each given point in time.
|
||||
|
||||
```js
|
||||
from(bucket: "apple_stand/autogen")
|
||||
|> range(start: 2018-06-18T12:00:00Z, stop: 2018-06-19T04:35:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "variety")
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(
|
||||
fn: (r) => ({r with
|
||||
granny_smith: r.granny_smith / r.total_weight * 100.0,
|
||||
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
|
||||
fuji: r.fuji / r.total_weight * 100.0,
|
||||
gala: r.gala / r.total_weight * 100.0,
|
||||
braeburn: r.braeburn / r.total_weight * 100.0,
|
||||
}),
|
||||
)
|
||||
```
|
||||
|
||||
## Calculate the average percentage of total weight per variety each hour
|
||||
|
||||
With the apple stand data from the prior example, use the following query to calculate the average percentage of the total weight each variety accounts for per hour.
|
||||
|
||||
```js
|
||||
from(bucket: "apple_stand/autogen")
|
||||
|> range(start: 2018-06-18T00:00:00Z, stop: 2018-06-19T16:35:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "variety")
|
||||
|> aggregateWindow(every: 1h, fn: mean)
|
||||
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(
|
||||
fn: (r) => ({r with
|
||||
granny_smith: r.granny_smith / r.total_weight * 100.0,
|
||||
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
|
||||
fuji: r.fuji / r.total_weight * 100.0,
|
||||
gala: r.gala / r.total_weight * 100.0,
|
||||
braeburn: r.braeburn / r.total_weight * 100.0,
|
||||
}),
|
||||
)
|
||||
```
|
||||
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
[InfluxQL](/enterprise_influxdb/v1.10/query_language/) lets you perform simple math equations
|
||||
which makes calculating percentages using two fields in a measurement pretty simple.
|
||||
However, there are some caveats you need to be aware of.
|
||||
|
||||
## Basic calculations within a query
|
||||
|
||||
`SELECT` statements support the use of basic math operators such as `+`, `-`, `/`, `*`, and `()`.
|
||||
|
||||
```sql
|
||||
-- Add two field keys
|
||||
SELECT field_key1 + field_key2 AS "field_key_sum" FROM "measurement_name" WHERE time < now() - 15m
|
||||
|
||||
-- Subtract one field from another
|
||||
SELECT field_key1 - field_key2 AS "field_key_difference" FROM "measurement_name" WHERE time < now() - 15m
|
||||
|
||||
-- Grouping and chaining mathematical calculations
|
||||
SELECT (field_key1 + field_key2) - (field_key3 + field_key4) AS "some_calculation" FROM "measurement_name" WHERE time < now() - 15m
|
||||
```
|
||||
|
||||
## Calculating a percentage in a query
|
||||
|
||||
Using basic math functions, you can calculate a percentage by dividing one field value
|
||||
by another and multiplying the result by 100:
|
||||
|
||||
```sql
|
||||
SELECT (field_key1 / field_key2) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m
|
||||
```
|
||||
|
||||
## Calculating a percentage using aggregate functions
|
||||
|
||||
If using aggregate functions in your percentage calculation, all data must be referenced
|
||||
using aggregate functions.
|
||||
_**You can't mix aggregate and non-aggregate data.**_
|
||||
|
||||
All aggregate functions need a `GROUP BY time()` clause defining the time intervals
|
||||
in which data points are grouped and aggregated.
|
||||
|
||||
```sql
|
||||
SELECT (sum(field_key1) / sum(field_key2)) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m GROUP BY time(1m)
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
#### Sample data
|
||||
|
||||
The following example uses simulated Apple Stand data that tracks the weight of
|
||||
baskets containing different varieties of apples throughout a day of business.
|
||||
|
||||
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
|
||||
2. Import the sample data:
|
||||
|
||||
```bash
|
||||
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
|
||||
```
|
||||
|
||||
### Calculating percentage of total weight per apple variety
|
||||
|
||||
The following query calculates the percentage of the total weight each variety
|
||||
accounts for at each given point in time.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
("braeburn"/total_weight)*100,
|
||||
("granny_smith"/total_weight)*100,
|
||||
("golden_delicious"/total_weight)*100,
|
||||
("fuji"/total_weight)*100,
|
||||
("gala"/total_weight)*100
|
||||
FROM "apple_stand"."autogen"."variety"
|
||||
```
|
||||
<div class='view-in-chronograf' data-query-override='SELECT
|
||||
("braeburn"/total_weight)*100,
|
||||
("granny_smith"/total_weight)*100,
|
||||
("golden_delicious"/total_weight)*100,
|
||||
("fuji"/total_weight)*100,
|
||||
("gala"/total_weight)*100
|
||||
FROM "apple_stand"."autogen"."variety"'>
|
||||
</div>
|
||||
|
||||
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
|
||||
in Chronograf, it would look like:
|
||||
|
||||
![Percentage of total per apple variety](/img/influxdb/1-5-calc-percentage-apple-variety.png)
|
||||
|
||||
### Calculating aggregate percentage per variety
|
||||
|
||||
The following query calculates the average percentage of the total weight each variety
|
||||
accounts for per hour.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
(mean("braeburn")/mean(total_weight))*100,
|
||||
(mean("granny_smith")/mean(total_weight))*100,
|
||||
(mean("golden_delicious")/mean(total_weight))*100,
|
||||
(mean("fuji")/mean(total_weight))*100,
|
||||
(mean("gala")/mean(total_weight))*100
|
||||
FROM "apple_stand"."autogen"."variety"
|
||||
WHERE time >= '2018-06-18T12:00:00Z' AND time <= '2018-06-19T04:35:00Z'
|
||||
GROUP BY time(1h)
|
||||
```
|
||||
<div class='view-in-chronograf' data-query-override='SELECT%0A%20%20%20%20%28mean%28"braeburn"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"granny_smith"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"golden_delicious"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"fuji"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"gala"%29%2Fmean%28total_weight%29%29%2A100%0AFROM%20"apple_stand"."autogen"."variety"%0AWHERE%20time%20>%3D%20%272018-06-18T12%3A00%3A00Z%27%20AND%20time%20<%3D%20%272018-06-19T04%3A35%3A00Z%27%0AGROUP%20BY%20time%281h%29'></div>
|
||||
|
||||
_**Note the following about this query:**_
|
||||
|
||||
- It uses aggregate functions (`mean()`) for pulling all data.
|
||||
- It includes a `GROUP BY time()` clause which aggregates data into 1 hour blocks.
|
||||
- It includes an explicitly limited time window. Without it, aggregate functions
|
||||
are very resource-intensive.
|
||||
|
||||
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
|
||||
in Chronograf, it would look like:
|
||||
|
||||
![Hourly average percentage of total per apple variety](/img/influxdb/1-5-calc-percentage-hourly-apple-variety.png)
|
||||
|
||||
{{% /tab-content %}}
|
||||
{{< /tabs-wrapper >}}
|
|
@ -0,0 +1,224 @@
|
|||
---
|
||||
title: Downsample and retain data
|
||||
description: Downsample data to keep high precision while preserving storage.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 30
|
||||
parent: Guides
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/downsampling_and_retention/
|
||||
v2: /influxdb/v2.0/process-data/common-tasks/downsample-data/
|
||||
---
|
||||
|
||||
InfluxDB can handle hundreds of thousands of data points per second. Working with that much data over a long period of time can create storage concerns.
|
||||
A natural solution is to downsample the data: keep the high-precision raw data for only a limited time, and store the lower-precision, summarized data longer.
|
||||
This guide describes how to automate the process of downsampling data and expiring old data using InfluxQL. To downsample and retain data using Flux and InfluxDB 2.0,
|
||||
see [Process data with InfluxDB tasks](/influxdb/v2.0/process-data/).
|
||||
|
||||
### Definitions
|
||||
|
||||
- **Continuous query** (CQ) is an InfluxQL query that runs automatically and periodically within a database.
|
||||
CQs require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
|
||||
|
||||
- **Retention policy** (RP) is the part of InfluxDB data structure that describes for how long InfluxDB keeps data.
|
||||
InfluxDB compares your local server's timestamp to the timestamps on your data and deletes data older than the RP's `DURATION`.
|
||||
A single database can have several RPs and RPs are unique per database.
|
||||
|
||||
This guide doesn't go into detail about the syntax for creating and managing CQs and RPs or tasks.
|
||||
If you're new to these concepts, we recommend reviewing the following:
|
||||
|
||||
- [CQ documentation](/enterprise_influxdb/v1.10/query_language/continuous_queries/) and
|
||||
- [RP documentation](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management).
|
||||
|
||||
### Sample data
|
||||
|
||||
This section uses fictional real-time data to track the number of food orders
|
||||
to a restaurant via phone and via website at ten second intervals.
|
||||
We store this data in a [database](/enterprise_influxdb/v1.10/concepts/glossary/#database) called `food_data`, in
|
||||
the [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) `orders`, and
|
||||
in the [fields](/enterprise_influxdb/v1.10/concepts/glossary/#field) `phone` and `website`.
|
||||
|
||||
Sample:
|
||||
|
||||
```bash
|
||||
name: orders
|
||||
------------
|
||||
time phone website
|
||||
2016-05-10T23:18:00Z 10 30
|
||||
2016-05-10T23:18:10Z 12 39
|
||||
2016-05-10T23:18:20Z 11 56
|
||||
```
|
||||
|
||||
### Goal
|
||||
|
||||
Assume that, in the long run, we're only interested in the average number of orders by phone
|
||||
and by website at 30 minute intervals.
|
||||
In the next steps, we use RPs and CQs to:
|
||||
|
||||
* Automatically aggregate the ten-second resolution data to 30-minute resolution data
|
||||
* Automatically delete the raw, ten-second resolution data that are older than two hours
|
||||
* Automatically delete the 30-minute resolution data that are older than 52 weeks
|
||||
|
||||
### Database preparation
|
||||
|
||||
We perform the following steps before writing the data to the database
|
||||
`food_data`.
|
||||
We do this **before** inserting any data because CQs only run against recent
|
||||
data; that is, data with timestamps that are no older than `now()` minus
|
||||
the `FOR` clause of the CQ, or `now()` minus the `GROUP BY time()` interval if
|
||||
the CQ has no `FOR` clause.
|
||||
|
||||
#### 1. Create the database
|
||||
|
||||
```sql
|
||||
> CREATE DATABASE "food_data"
|
||||
```
|
||||
|
||||
#### 2. Create a two-hour `DEFAULT` retention policy
|
||||
|
||||
InfluxDB writes to the `DEFAULT` retention policy if we do not supply an explicit RP when
|
||||
writing a point to the database.
|
||||
We make the `DEFAULT` RP keep data for two hours, because we want InfluxDB to
|
||||
automatically write the incoming ten-second resolution data to that RP.
|
||||
|
||||
Use the
|
||||
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
|
||||
statement to create a `DEFAULT` RP:
|
||||
|
||||
```sql
|
||||
> CREATE RETENTION POLICY "two_hours" ON "food_data" DURATION 2h REPLICATION 1 DEFAULT
|
||||
```
|
||||
|
||||
That query creates an RP called `two_hours` that exists in the database
|
||||
`food_data`.
|
||||
`two_hours` keeps data for a `DURATION` of two hours (`2h`) and it's the `DEFAULT`
|
||||
RP for the database `food_data`.
|
||||
|
||||
{{% warn %}}
|
||||
The replication factor (`REPLICATION 1`) is a required parameter but must always
|
||||
be set to 1 for single node instances.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** When we created the `food_data` database in step 1, InfluxDB
|
||||
automatically generated an RP named `autogen` and set it as the `DEFAULT`
|
||||
RP for the database.
|
||||
The `autogen` RP has an infinite retention period.
|
||||
With the query above, the RP `two_hours` replaces `autogen` as the `DEFAULT` RP
|
||||
for the `food_data` database.
|
||||
|
||||
#### 3. Create a 52-week retention policy
|
||||
|
||||
Next we want to create another retention policy that keeps data for 52 weeks and is not the
|
||||
`DEFAULT` retention policy (RP) for the database.
|
||||
Ultimately, the 30-minute rollup data will be stored in this RP.
|
||||
|
||||
Use the
|
||||
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
|
||||
statement to create a non-`DEFAULT` retention policy:
|
||||
|
||||
```sql
|
||||
> CREATE RETENTION POLICY "a_year" ON "food_data" DURATION 52w REPLICATION 1
|
||||
```
|
||||
|
||||
That query creates a retention policy (RP) called `a_year` that exists in the database
|
||||
`food_data`.
|
||||
The `a_year` setting keeps data for a `DURATION` of 52 weeks (`52w`).
|
||||
Leaving out the `DEFAULT` argument ensures that `a_year` is not the `DEFAULT`
|
||||
RP for the database `food_data`.
|
||||
That is, write and read operations against `food_data` that do not specify an
|
||||
RP will still go to the `two_hours` RP (the `DEFAULT` RP).
|
||||
|
||||
#### 4. Create the continuous query
|
||||
|
||||
Now that we've set up our RPs, we want to create a continuous query (CQ) that will automatically
|
||||
and periodically downsample the ten-second resolution data to the 30-minute
|
||||
resolution, and then store those results in a different measurement with a different
|
||||
retention policy.
|
||||
|
||||
Use the
|
||||
[`CREATE CONTINUOUS QUERY`](/enterprise_influxdb/v1.10/query_language/continuous_queries/)
|
||||
statement to generate a CQ:
|
||||
|
||||
```sql
|
||||
> CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN
|
||||
SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone"
|
||||
INTO "a_year"."downsampled_orders"
|
||||
FROM "orders"
|
||||
GROUP BY time(30m)
|
||||
END
|
||||
```
|
||||
|
||||
That query creates a CQ called `cq_30m` in the database `food_data`.
|
||||
`cq_30m` tells InfluxDB to calculate the 30-minute average of the two fields
|
||||
`website` and `phone` in the measurement `orders` and in the `DEFAULT` RP
|
||||
`two_hours`.
|
||||
It also tells InfluxDB to write those results to the measurement
|
||||
`downsampled_orders` in the retention policy `a_year` with the field keys
|
||||
`mean_website` and `mean_phone`.
|
||||
InfluxDB will run this query every 30 minutes for the previous 30 minutes.
|
||||
|
||||
> **Note:** Notice that we fully qualify (that is, we use the syntax
|
||||
`"<retention_policy>"."<measurement>"`) the measurement in the `INTO`
|
||||
clause.
|
||||
InfluxDB requires that syntax to write data to an RP other than the `DEFAULT`
|
||||
RP.
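
To confirm the CQ was registered, `SHOW CONTINUOUS QUERIES` lists it under the database. Trimmed output might look like this (exact formatting varies by version):

```sql
> SHOW CONTINUOUS QUERIES
name: food_data
name    query
----    -----
cq_30m  CREATE CONTINUOUS QUERY cq_30m ON food_data BEGIN SELECT mean(website) AS mean_website, mean(phone) AS mean_phone INTO food_data.a_year.downsampled_orders FROM food_data.two_hours.orders GROUP BY time(30m) END
```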
|
||||
|
||||
### Results
|
||||
|
||||
With the new CQ and two new RPs, `food_data` is ready to start receiving data.
|
||||
After writing data to our database and letting things run for a bit, we see
|
||||
two measurements: `orders` and `downsampled_orders`.
|
||||
|
||||
```sql
|
||||
> SELECT * FROM "orders" LIMIT 5
|
||||
name: orders
|
||||
---------
|
||||
time phone website
|
||||
2016-05-13T23:00:00Z 10 30
|
||||
2016-05-13T23:00:10Z 12 39
|
||||
2016-05-13T23:00:20Z 11 56
|
||||
2016-05-13T23:00:30Z 8 34
|
||||
2016-05-13T23:00:40Z 17 32
|
||||
|
||||
> SELECT * FROM "a_year"."downsampled_orders" LIMIT 5
|
||||
name: downsampled_orders
|
||||
---------------------
|
||||
time mean_phone mean_website
|
||||
2016-05-13T15:00:00Z 12 23
|
||||
2016-05-13T15:30:00Z 13 32
|
||||
2016-05-13T16:00:00Z 19 21
|
||||
2016-05-13T16:30:00Z 3 26
|
||||
2016-05-13T17:00:00Z 4 23
|
||||
```
|
||||
|
||||
The data in `orders` are the raw, ten-second resolution data that reside in the
|
||||
two-hour RP.
|
||||
The data in `downsampled_orders` are the aggregated, 30-minute resolution data
|
||||
that are subject to the 52-week RP.
|
||||
|
||||
Notice that the first timestamps in `downsampled_orders` are older than the first
|
||||
timestamps in `orders`.
|
||||
This is because InfluxDB has already deleted data from `orders` with timestamps
|
||||
that are older than our local server's timestamp minus two hours (assume we
|
||||
executed the `SELECT` queries at `2016-05-14T00:59:59Z`).
|
||||
InfluxDB will only start dropping data from `downsampled_orders` after 52 weeks.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
* Notice that we fully qualify (that is, we use the syntax
|
||||
`"<retention_policy>"."<measurement>"`) `downsampled_orders` in
|
||||
the second `SELECT` statement. We must specify the RP in that query to `SELECT`
|
||||
data that reside in an RP other than the `DEFAULT` RP.
|
||||
>
|
||||
* By default, InfluxDB checks to enforce an RP every 30 minutes.
|
||||
Between checks, `orders` may have data that are older than two hours.
|
||||
The rate at which InfluxDB checks to enforce an RP is a configurable setting;
|
||||
see
|
||||
[Database Configuration](/enterprise_influxdb/v1.10/administration/configure/config-data-nodes/#check-interval).
|
||||
|
||||
Using a combination of RPs and CQs, we've successfully set up our database to
|
||||
automatically keep the high precision raw data for a limited time, create lower
|
||||
precision data, and store that lower precision data for a longer period of time.
|
||||
Now that you have a general understanding of how these features can work
|
||||
together, check out the detailed documentation on [CQs](/enterprise_influxdb/v1.10/query_language/continuous_queries/) and [RPs](/enterprise_influxdb/v1.10/query_language/manage-database/#retention-policy-management)
|
||||
to see all that they can do for you.
|
|
@ -0,0 +1,343 @@
|
|||
---
|
||||
title: Hardware sizing guidelines
|
||||
description: >
|
||||
Review configuration and hardware guidelines for InfluxDB OSS (open source) and InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 40
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
Review configuration and hardware guidelines for InfluxDB Enterprise:
|
||||
|
||||
* [Enterprise overview](#enterprise-overview)
|
||||
* [Query guidelines](#query-guidelines)
|
||||
* [InfluxDB OSS guidelines](#influxdb-oss-guidelines)
|
||||
* [InfluxDB Enterprise cluster guidelines](#influxdb-enterprise-cluster-guidelines)
|
||||
* [When do I need more RAM?](#when-do-i-need-more-ram)
|
||||
* [Recommended cluster configurations](#recommended-cluster-configurations)
|
||||
* [Storage: type, amount, and configuration](#storage-type-amount-and-configuration)
|
||||
|
||||
For InfluxDB OSS instances, see [OSS hardware sizing guidelines](https://docs.influxdata.com/influxdb/v1.8/guides/hardware_sizing/).
|
||||
|
||||
> **Disclaimer:** Your numbers may vary from recommended guidelines. Guidelines provide estimated benchmarks for implementing the most performant system for your business.
|
||||
|
||||
## Enterprise overview
|
||||
|
||||
InfluxDB Enterprise supports the following:
|
||||
|
||||
- more than 750,000 field writes per second
|
||||
- more than 100 moderate queries per second ([see Query guidelines](#query-guidelines))
|
||||
- more than 10,000,000 [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality)
|
||||
|
||||
InfluxDB Enterprise distributes multiple copies of your data across a cluster,
|
||||
providing high availability and redundancy, so an unavailable node doesn’t significantly impact the cluster.
|
||||
Please [contact us](https://www.influxdata.com/contact-sales/) for assistance tuning your system.
|
||||
|
||||
If you want a single-node instance of InfluxDB that's fully open source, requires fewer writes, queries, and unique series than listed above, and does **not require** redundancy, we recommend InfluxDB OSS.
|
||||
|
||||
> **Note:** Without the redundancy of a cluster, writes and queries fail immediately when a server is unavailable.
|
||||
|
||||
## Query guidelines
|
||||
|
||||
> Query complexity varies widely in its impact on the system. Recommendations for both single nodes and clusters are based on **moderate** query loads.
|
||||
|
||||
For **simple** or **complex** queries, we recommend testing and adjusting the suggested requirements as needed. Query complexity is defined by the following criteria:
|
||||
|
||||
| Query complexity | Criteria |
|
||||
|:------------------|:---------------------------------------------------------------------------------------|
|
||||
| Simple | Have few or no functions and no regular expressions |
|
||||
| | Are bounded in time to a few minutes, hours, or 24 hours at most |
|
||||
| | Typically execute in a few milliseconds to a few dozen milliseconds |
|
||||
| Moderate | Have multiple functions and one or two regular expressions |
|
||||
| | May also have `GROUP BY` clauses or sample a time range of multiple weeks |
|
||||
| | Typically execute in a few hundred or a few thousand milliseconds |
|
||||
| Complex | Have multiple aggregation or transformation functions or multiple regular expressions |
|
||||
| | May sample a very large time range of months or years |
|
||||
| | Typically take multiple seconds to execute |
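
As a rough illustration only (the measurement, field, and tag names below are hypothetical), a **moderate** query by these criteria might combine an aggregation function, a regular expression, and a `GROUP BY` over a multi-week range:

```sql
SELECT mean("usage_user")
FROM "cpu"
WHERE "host" =~ /^web-/ AND time > now() - 4w
GROUP BY time(1h), "host"
```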
|
||||
|
||||
## InfluxDB Enterprise cluster guidelines
|
||||
|
||||
### Meta nodes
|
||||
|
||||
> Set up clusters with an odd number of meta nodes. An even number of meta nodes may cause issues in certain configurations.
|
||||
|
||||
A cluster must have a **minimum of three** independent meta nodes for data redundancy and availability. A cluster with `2n + 1` meta nodes can tolerate the loss of `n` meta nodes.
|
||||
|
||||
Meta nodes do not need very much computing power. Regardless of the cluster load, we recommend the following guidelines for the meta nodes:
|
||||
|
||||
* vCPU or CPU: 1-2 cores
|
||||
* RAM: 512 MB - 1 GB
|
||||
* IOPS: 50
|
||||
|
||||
### Data nodes
|
||||
|
||||
A cluster with one data node is valid but has no data redundancy. Redundancy is set by the [replication factor](/influxdb/v1.8/concepts/glossary/#replication-factor) on the retention policy the data is written to. Where `n` is the replication factor, a cluster can lose `n - 1` data nodes and return complete query results.
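
For example, to keep two copies of each shard, you might set the replication factor to 2 when creating a retention policy (the database and RP names here are illustrative):

```sql
CREATE RETENTION POLICY "two_weeks" ON "mydb" DURATION 2w REPLICATION 2
```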
|
||||
|
||||
>**Note:** For optimal data distribution within the cluster, use an even number of data nodes.
|
||||
|
||||
Guidelines vary by writes per second per node, moderate queries per second per node, and the number of unique series per node.
|
||||
|
||||
#### Guidelines per node
|
||||
|
||||
| vCPU or CPU | RAM | IOPS | Writes per second | Queries* per second | Unique series |
|
||||
| ----------: | -------: | ----: | ----------------: | ------------------: | ------------: |
|
||||
| 2 cores | 4-8 GB | 1000 | < 5,000 | < 5 | < 100,000 |
|
||||
| 4-6 cores | 16-32 GB | 1000+ | < 100,000 | < 25 | < 1,000,000 |
|
||||
| 8+ cores | 32+ GB | 1000+ | > 100,000 | > 25 | > 1,000,000 |
|
||||
|
||||
* Guidelines are provided for moderate queries. Queries vary widely in their impact on the system. For simple or complex queries, we recommend testing and adjusting the suggested requirements as needed. See [query guidelines](#query-guidelines) for detail.
|
||||
|
||||
## When do I need more RAM?
|
||||
|
||||
In general, more RAM helps queries return faster. Your RAM requirements are primarily determined by [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality). Higher cardinality requires more RAM. Regardless of RAM, a series cardinality of 10 million or more can cause OOM (out of memory) failures. You can usually resolve OOM issues by redesigning your [schema](/influxdb/v1.8/concepts/glossary/#schema).
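
To gauge where a database stands, you can ask InfluxQL for its series cardinality. The approximate and exact variants below are a quick sketch (the database name is illustrative, and the exact count is more expensive to compute):

```sql
> SHOW SERIES CARDINALITY ON "mydb"
> SHOW SERIES EXACT CARDINALITY ON "mydb"
```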
|
||||
|
||||
|
||||
## Guidelines per cluster
|
||||
|
||||
InfluxDB Enterprise guidelines vary by writes and queries per second, series cardinality, replication factor, and infrastructure (AWS EC2 R4 instances or equivalent):
|
||||
- R4.xlarge (4 cores)
|
||||
- R4.2xlarge (8 cores)
|
||||
- R4.4xlarge (16 cores)
|
||||
- R4.8xlarge (32 cores)
|
||||
|
||||
> Guidelines stem from a DevOps monitoring use case: maintaining a group of computers and monitoring server metrics (such as CPU, kernel, memory, disk space, disk I/O, network, and so on).
|
||||
|
||||
### Recommended cluster configurations
|
||||
|
||||
Cluster configuration guidelines are organized by:
|
||||
|
||||
- Series cardinality in your data set: 10,000, 100,000, 1,000,000, or 10,000,000
|
||||
- Number of data nodes
|
||||
- Number of server cores
|
||||
|
||||
For each cluster configuration, you'll find guidelines for the following:
|
||||
|
||||
- **maximum writes per second only** (no dashboard queries are running)
|
||||
- **maximum queries per second only** (no data is being written)
|
||||
- **maximum simultaneous queries and writes per second, combined**
|
||||
|
||||
#### Review cluster configuration tables
|
||||
|
||||
1. Select the series cardinality tab below, and then click to expand a replication factor.
|
||||
2. In the **Nodes x Core** column, find the number of data nodes and server cores in your configuration, and then review the recommended **maximum** guidelines.
|
||||
|
||||
{{< tabs-wrapper >}}
|
||||
{{% tabs %}}
|
||||
[10,000 series](#)
|
||||
[100,000 series](#)
|
||||
[1,000,000 series](#)
|
||||
[10,000,000 series](#)
|
||||
{{% /tabs %}}
|
||||
{{% tab-content %}}
|
||||
|
||||
Select one of the following replication factors to see the recommended cluster configuration for 10,000 series:
|
||||
|
||||
{{% expand "Replication factor, 1" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 1 x 4 | 188,000 | 5 | 4 + 99,000 |
|
||||
| 1 x 8 | 405,000 | 9 | 8 + 207,000 |
|
||||
| 1 x 16 | 673,000 | 15 | 14 + 375,000 |
|
||||
| 1 x 32 | 1,056,000 | 24 | 22 + 650,000 |
|
||||
| 2 x 4 | 384,000 | 14 | 14 + 184,000 |
|
||||
| 2 x 8 | 746,000 | 22 | 22 + 334,000 |
|
||||
| 2 x 16 | 1,511,000 | 56 | 40 + 878,000 |
|
||||
| 2 x 32 | 2,426,000 | 96 | 68 + 1,746,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 2" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 2 x 4 | 296,000 | 16 | 16 + 151,000 |
|
||||
| 2 x 8 | 560,000 | 30 | 26 + 290,000 |
|
||||
| 2 x 16 | 972,000 | 54 | 50 + 456,000 |
|
||||
| 2 x 32 | 1,860,000 | 84 | 74 + 881,000 |
|
||||
| 4 x 8 | 1,781,000 | 100 | 64 + 682,000 |
|
||||
| 4 x 16 | 3,430,000 | 192 | 104 + 1,732,000 |
|
||||
| 4 x 32 | 6,351,000 | 432 | 188 + 3,283,000 |
|
||||
| 6 x 8 | 2,923,000 | 216 | 138 + 1,049,000 |
|
||||
| 6 x 16 | 5,650,000 | 498 | 246 + 2,246,000 |
|
||||
| 6 x 32 | 9,842,000 | 1248 | 528 + 5,229,000 |
|
||||
| 8 x 8 | 3,987,000 | 632 | 336 + 1,722,000 |
|
||||
| 8 x 16 | 7,798,000 | 1384 | 544 + 3,911,000 |
|
||||
| 8 x 32 | 13,189,000 | 3648 | 1,152 + 7,891,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 3" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 3 x 8 | 815,000 | 63 | 54 + 335,000 |
|
||||
| 3 x 16 | 1,688,000 | 120 | 87 + 705,000 |
|
||||
| 3 x 32 | 3,164,000 | 255 | 132 + 1,626,000 |
|
||||
| 6 x 8 | 2,269,000 | 252 | 168 + 838,000 |
|
||||
| 6 x 16 | 4,593,000 | 624 | 336 + 2,019,000 |
|
||||
| 6 x 32 | 7,776,000 | 1340 | 576 + 3,624,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
Select one of the following replication factors to see the recommended cluster configuration for 100,000 series:
|
||||
|
||||
{{% expand "Replication factor, 1" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 1 x 4 | 143,000 | 5 | 4 + 77,000 |
|
||||
| 1 x 8 | 322,000 | 9 | 8 + 167,000 |
|
||||
| 1 x 16 | 624,000 | 17 | 12 + 337,000 |
|
||||
| 1 x 32 | 1,114,000 | 26 | 18 + 657,000 |
|
||||
| 2 x 4 | 265,000 | 14 | 12 + 115,000 |
|
||||
| 2 x 8 | 573,000 | 30 | 22 + 269,000 |
|
||||
| 2 x 16 | 1,261,000 | 52 | 38 + 679,000 |
|
||||
| 2 x 32 | 2,335,000 | 90 | 66 + 1,510,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 2" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 2 x 4 | 196,000 | 16 | 14 + 77,000 |
|
||||
| 2 x 8 | 482,000 | 30 | 24 + 203,000 |
|
||||
| 2 x 16 | 1,060,000 | 60 | 42 + 415,000 |
|
||||
| 2 x 32 | 1,958,000 | 94 | 64 + 984,000 |
|
||||
| 4 x 8 | 1,144,000 | 108 | 68 + 406,000 |
|
||||
| 4 x 16 | 2,512,000 | 228 | 148 + 866,000 |
|
||||
| 4 x 32 | 4,346,000 | 564 | 320 + 1,886,000 |
|
||||
| 6 x 8 | 1,802,000 | 252 | 156 + 618,000 |
|
||||
| 6 x 16 | 3,924,000 | 562 | 384 + 1,068,000 |
|
||||
| 6 x 32 | 6,533,000 | 1340 | 912 + 2,083,000 |
|
||||
| 8 x 8 | 2,516,000 | 712 | 360 + 1,020,000 |
|
||||
| 8 x 16 | 5,478,000 | 1632 | 1,024 + 1,843,000 |
|
||||
| 8 x 32 | 10,527,000 | 3392 | 1,792 + 4,998,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 3" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 3 x 8 | 616,000 | 72 | 51 + 218,000 |
|
||||
| 3 x 16 | 1,268,000 | 117 | 84 + 438,000 |
|
||||
| 3 x 32 | 2,260,000 | 189 | 114 + 984,000 |
|
||||
| 6 x 8 | 1,393,000 | 294 | 192 + 421,000 |
|
||||
| 6 x 16 | 3,056,000 | 726 | 456 + 893,000 |
|
||||
| 6 x 32 | 5,017,000 | 1584 | 798 + 1,098,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
Select one of the following replication factors to see the recommended cluster configuration for 1,000,000 series:
|
||||
|
||||
{{% expand "Replication factor, 2" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:-------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 2 x 4 | 104,000 | 18 | 12 + 54,000 |
|
||||
| 2 x 8 | 195,000 | 36 | 24 + 99,000 |
|
||||
| 2 x 16 | 498,000 | 70 | 44 + 145,000 |
|
||||
| 2 x 32 | 1,195,000 | 102 | 84 + 232,000 |
|
||||
| 4 x 8 | 488,000 | 120 | 56 + 222,000 |
|
||||
| 4 x 16 | 1,023,000 | 244 | 112 + 428,000 |
|
||||
| 4 x 32 | 2,686,000 | 468 | 208 + 729,000 |
|
||||
| 6 x 8 | 845,000 | 270 | 126 + 356,000 |
|
||||
| 6 x 16 | 1,780,000 | 606 | 288 + 663,000 |
|
||||
| 6 x 32 | 430,000 | 1,488 | 624 + 1,209,000 |
|
||||
| 8 x 8 | 1,831,000 | 808 | 296 + 778,000 |
|
||||
| 8 x 16 | 4,167,000 | 1,856 | 640 + 2,031,000 |
|
||||
| 8 x 32 | 7,813,000 | 3,201 | 896 + 4,897,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 3" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 3 x 8 | 234,000 | 72 | 42 + 87,000 |
|
||||
| 3 x 16 | 613,000 | 120 | 75 + 166,000 |
|
||||
| 3 x 32 | 1,365,000 | 141 | 114 + 984,000 |
|
||||
| 6 x 8 | 593,000 | 318 | 144 + 288,000 |
|
||||
| 6 x 16 | 1,545,000 | 744 | 384 + 407,000 |
|
||||
| 6 x 32 | 3,204,000 | 1632 | 912 + 505,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
Select one of the following replication factors to see the recommended cluster configuration for 10,000,000 series:
|
||||
|
||||
{{% expand "Replication factor, 1" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 2 x 4 | 122,000 | 16 | 12 + 81,000 |
|
||||
| 2 x 8 | 259,000 | 36 | 24 + 143,000 |
|
||||
| 2 x 16 | 501,000 | 66 | 44 + 290,000 |
|
||||
| 2 x 32 | 646,000 | 142 | 54 + 400,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 2" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 2 x 4 | 87,000 | 18 | 14 + 56,000 |
|
||||
| 2 x 8 | 169,000 | 38 | 24 + 98,000 |
|
||||
| 2 x 16 | 334,000 | 76 | 46 + 224,000 |
|
||||
| 2 x 32 | 534,000 | 136 | 58 + 388,000 |
|
||||
| 4 x 8 | 335,000 | 120 | 60 + 204,000 |
|
||||
| 4 x 16 | 643,000 | 256 | 112 + 395,000 |
|
||||
| 4 x 32 | 967,000 | 560 | 158 + 806,000 |
|
||||
| 6 x 8 | 521,000 | 378 | 144 + 319,000 |
|
||||
| 6 x 16 | 890,000 | 582 | 186 + 513,000 |
|
||||
| 8 x 8 | 699,000 | 1,032 | 256 + 477,000 |
|
||||
| 8 x 16 | 1,345,000 | 2,048 | 544 + 741,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% expand "Replication factor, 3" %}}
|
||||
|
||||
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|
||||
|:------------:|------------------:|-------------------:|:---------------------------:|
|
||||
| 3 x 8 | 170,000 | 60 | 42 + 98,000 |
|
||||
| 3 x 16 | 333,000 | 129 | 76 + 206,000 |
|
||||
| 3 x 32 | 609,000 | 178 | 60 + 162,000 |
|
||||
| 6 x 8 | 395,000 | 402 | 132 + 247,000 |
|
||||
| 6 x 16 | 679,000 | 894 | 150 + 527,000 |
|
||||
|
||||
{{% /expand %}}
|
||||
|
||||
{{% /tab-content %}}
|
||||
{{< /tabs-wrapper >}}
|
||||
|
||||
## Storage: type, amount, and configuration
|
||||
|
||||
### Storage volume and IOPS
|
||||
|
||||
Consider the type of storage you need and the amount. InfluxDB is designed to run on solid state drives (SSDs) and memory-optimized cloud instances, for example, AWS EC2 R5 or R4 instances. InfluxDB isn't tested on hard disk drives (HDDs) and we don't recommend HDDs for production. For best results, InfluxDB servers must have a minimum of 1000 IOPS on storage to ensure recovery and availability. We recommend at least 2000 IOPS for rapid recovery of cluster data nodes after downtime.
|
||||
|
||||
See your cloud provider documentation for IOPS detail on your storage volumes.
|
||||
|
||||
### Bytes and compression
|
||||
|
||||
Database names, [measurements](/influxdb/v1.8/concepts/glossary/#measurement), [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key), [field keys](/influxdb/v1.8/concepts/glossary/#field-key), and [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) are stored only once and always as strings. [Field values](/influxdb/v1.8/concepts/glossary/#field-value) and [timestamps](/influxdb/v1.8/concepts/glossary/#timestamp) are stored for every point.
|
||||
|
||||
Non-string values require approximately three bytes. String values require variable space, determined by string compression.
|
||||
|
||||
### Separate `wal` and `data` directories
|
||||
|
||||
When running InfluxDB in a production environment, store the `wal` directory and the `data` directory on separate storage devices. This optimization significantly reduces disk contention under heavy write load, an important consideration if the write load is highly variable. If the write load does not vary by more than 15%, the optimization is probably not necessary.
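
A minimal sketch of the relevant settings in `influxdb.conf`, assuming the WAL is moved to a separate device mounted at `/mnt/influx-wal` (both paths are illustrative):

```toml
[data]
  # TSM files on the primary data volume
  dir = "/var/lib/influxdb/data"
  # Write-ahead log on its own device to avoid contention with TSM writes
  wal-dir = "/mnt/influx-wal/wal"
```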
|
|
@ -0,0 +1,322 @@
|
|||
---
|
||||
title: Migrate InfluxDB OSS instances to InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Migrate a running instance of InfluxDB open source (OSS) to an InfluxDB Enterprise cluster.
|
||||
aliases:
|
||||
- /enterprise/v1.10/guides/migration/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Migrate InfluxDB OSS to Enterprise
|
||||
weight: 10
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
Migrate a running instance of InfluxDB open source (OSS) to an InfluxDB Enterprise cluster.
|
||||
|
||||
{{% note %}}
|
||||
Migration transfers all users from the OSS instance to the InfluxDB Enterprise cluster.
|
||||
{{% /note %}}
|
||||
|
||||
## Migrate an OSS instance to InfluxDB Enterprise
|
||||
|
||||
Complete the following tasks
|
||||
to migrate data from OSS to an InfluxDB Enterprise cluster without downtime or missing data.
|
||||
|
||||
1. Upgrade InfluxDB OSS and InfluxDB Enterprise to the latest stable versions.
|
||||
- [Upgrade InfluxDB OSS](/{{< latest "influxdb" "v1" >}}/administration/upgrading/)
|
||||
- [Upgrade InfluxDB Enterprise](/enterprise_influxdb/v1.10/administration/upgrading/)
|
||||
|
||||
2. On each meta node and each data node,
|
||||
add the IP and hostname of your OSS instance to the `/etc/hosts` file.
|
||||
This will allow the nodes to communicate with the OSS instance.
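
   For example, an `/etc/hosts` entry might look like the following (the IP address and hostname are hypothetical):

   ```txt
   # OSS instance to migrate from
   203.0.113.10   influxdb-oss.example.com
   ```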
|
||||
|
||||
3. On the OSS instance, take a portable backup from OSS using the **influxd backup** command
|
||||
with the `-portable` flag:
|
||||
|
||||
```sh
|
||||
influxd backup -portable -host <IP address>:8088 /tmp/mysnapshot
|
||||
```
|
||||
|
||||
Note the current date and time when you take the backup.
|
||||
For more information, see [influxd backup](/influxdb/v1.8/tools/influxd/backup/).
|
||||
|
||||
4. Restore the backup on the cluster by running the following:
|
||||
|
||||
```sh
|
||||
influxd-ctl restore [ -host <host:port> ] <path-to-backup-files>
|
||||
```
|
||||
|
||||
> **Note:** InfluxDB Enterprise uses the **influxd-ctl utility** to back up and restore data. For more information,
|
||||
see [influxd-ctl](/enterprise_influxdb/v1.10/tools/influxd-ctl)
|
||||
and [`restore`](/enterprise_influxdb/v1.10/administration/backup-and-restore/#restore).
|
||||
|
||||
5. To avoid data loss, dual write to both OSS and Enterprise while completing the remaining steps.
|
||||
This keeps the OSS and cluster active for testing and acceptance work. For more information, see [Write data with the InfluxDB API](/enterprise_influxdb/v1.10/guides/write_data/).
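
   A dual write can be as simple as sending the same line protocol to both endpoints. A sketch with hypothetical hostnames and database name:

   ```sh
   POINT='cpu_load_short,host=server01,region=us-west value=0.64'
   # Write to the existing OSS instance
   curl -XPOST 'http://influxdb-oss.example.com:8086/write?db=mydb' --data-binary "$POINT"
   # Write the same point to an Enterprise data node (or load balancer)
   curl -XPOST 'http://enterprise-data.example.com:8086/write?db=mydb' --data-binary "$POINT"
   ```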
|
||||
|
||||
|
||||
6. [Export data from OSS](/enterprise_influxdb/v1.10/administration/backup-and-restore/#exporting-data)
|
||||
from the time the backup was taken to the time the dual write started.
|
||||
For example, if you take the backup on `2020-07-19T00:00:00.000Z`,
|
||||
and started writing data to Enterprise at `2020-07-19T23:59:59.999Z`,
|
||||
you would run the following command:
|
||||
|
||||
```sh
|
||||
influx_inspect export -compress -start 2020-07-19T00:00:00.000Z -end 2020-07-19T23:59:59.999Z
|
||||
```
|
||||
|
||||
For more information, see [`-export`](/enterprise_influxdb/v1.10/tools/influx_inspect#export).
|
||||
|
||||
7. [Import data into Enterprise](/enterprise_influxdb/v1.10/administration/backup-and-restore/#importing-data).
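
   Assuming the export from the previous step was written to `/tmp/food_data_export.gz` (the path and hostname below are illustrative), the import might look like:

   ```sh
   influx -host enterprise-data.example.com -import -path=/tmp/food_data_export.gz -compressed
   ```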
|
||||
|
||||
8. Verify data is successfully migrated to your Enterprise cluster. See:
|
||||
- [Query data with the InfluxDB API](/enterprise_influxdb/v1.10/guides/query_data/)
|
||||
- [View data in Chronograf](/{{< latest "chronograf" >}}/)
|
||||
|
||||
Next, stop writes to the OSS instance and remove it.
|
||||
|
||||
#### Stop writes and remove OSS
|
||||
|
||||
1. Stop all writes to the InfluxDB OSS instance.
|
||||
2. Stop the `influxdb` service on the InfluxDB OSS instance server.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo service influxdb stop
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl stop influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
3. Double check that the service is stopped.
|
||||
The following command should return nothing:
|
||||
|
||||
```bash
|
||||
ps ax | grep influxd
|
||||
```
|
||||
|
||||
4. Remove the InfluxDB OSS package.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[Debian & Ubuntu](#)
|
||||
[RHEL & CentOS](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo apt-get remove influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo yum remove influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
<!--
|
||||
### Migrate a data set with downtime
|
||||
|
||||
1. [Stop writes and remove OSS](#stop-writes-and-remove-oss)
|
||||
2. [Back up OSS configuration](#back-up-oss-configuration)
|
||||
3. [Add the upgraded OSS instance to the InfluxDB Enterprise cluster](#add-the-new-data-node-to-the-cluster)
|
||||
4. [Add existing data nodes back to the cluster](#add-existing-data-nodes-back-to-the-cluster)
|
||||
5. [Rebalance the cluster](#rebalance-the-cluster)
|
||||
|
||||
#### Stop writes and remove OSS
|
||||
|
||||
1. Stop all writes to the InfluxDB OSS instance.
|
||||
2. Stop the `influxdb` service on the InfluxDB OSS instance.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo service influxdb stop
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl stop influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
Double check that the service is stopped.
|
||||
The following command should return nothing:
|
||||
|
||||
```bash
|
||||
ps ax | grep influxd
|
||||
```
|
||||
|
||||
3. Remove the InfluxDB OSS package.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[Debian & Ubuntu](#)
|
||||
[RHEL & CentOS](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo apt-get remove influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo yum remove influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
#### Back up and migrate your InfluxDB OSS configuration file
|
||||
|
||||
1. **Back up your InfluxDB OSS configuration file**.
|
||||
If you have custom configuration settings for InfluxDB OSS, back up and save your configuration file.
|
||||
|
||||
{{% warn %}}
|
||||
Without a backup, you'll lose custom configuration settings when updating the InfluxDB binary.
|
||||
{{% /warn %}}
|
||||
|
||||
2. **Update the InfluxDB binary**.
|
||||
|
||||
> Updating the InfluxDB binary overwrites the existing configuration file.
|
||||
> To keep custom settings, back up your configuration file.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[Debian & Ubuntu](#)
|
||||
[RHEL & CentOS](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
sudo yum localinstall influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
3. **Update the configuration file**.
|
||||
|
||||
In `/etc/influxdb/influxdb.conf`:
|
||||
|
||||
- set `hostname` to the full hostname of the data node
|
||||
- set `license-key` in the `[enterprise]` section to the license key you received on InfluxPortal
|
||||
**or** set `license-path` in the `[enterprise]` section to
|
||||
the local path to the JSON license file you received from InfluxData.
|
||||
|
||||
{{% warn %}}
|
||||
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to an empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
```toml
|
||||
# Hostname advertised by this host for remote addresses.
|
||||
# This must be accessible to all nodes in the cluster.
|
||||
hostname="<data-node-hostname>"
|
||||
|
||||
[enterprise]
|
||||
# license-key and license-path are mutually exclusive,
|
||||
# use only one and leave the other blank
|
||||
license-key = "<your_license_key>"
|
||||
license-path = "/path/to/readable/JSON.license.file"
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Transfer any custom settings from the backup of your OSS configuration file
|
||||
to the new Enterprise configuration file.
|
||||
{{% /note %}}
|
||||
|
||||
4. **Update the `/etc/hosts` file**.
|
||||
|
||||
Add all meta and data nodes to the `/etc/hosts` file to allow the OSS instance
|
||||
to communicate with other nodes in the InfluxDB Enterprise cluster.
|
||||
|
||||
5. **Start the data node**.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo service influxdb start
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
sudo systemctl start influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
#### Add the new data node to the cluster
|
||||
|
||||
After you upgrade your OSS instance to InfluxDB Enterprise, add the node to your Enterprise cluster.
|
||||
|
||||
From a **meta** node in the cluster, run:
|
||||
|
||||
```bash
|
||||
influxd-ctl add-data <new-data-node-hostname>:8088
|
||||
```
|
||||
|
||||
The output should look like:
|
||||
|
||||
```bash
|
||||
Added data node y at new-data-node-hostname:8088
|
||||
```
|
||||
|
||||
#### Add existing data nodes back to the cluster
|
||||
|
||||
If you removed any existing data nodes from your InfluxDB Enterprise cluster,
|
||||
add them back to the cluster.
|
||||
|
||||
1. From a **meta** node in the InfluxDB Enterprise cluster, run the following for
|
||||
**each data node**:
|
||||
|
||||
```bash
|
||||
influxd-ctl add-data <the-hostname>:8088
|
||||
```
|
||||
|
||||
It should output:
|
||||
|
||||
```bash
|
||||
Added data node y at the-hostname:8088
|
||||
```
|
||||
|
||||
2. Verify that all nodes are now members of the cluster as expected:
|
||||
|
||||
```bash
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
Once added to the cluster, InfluxDB synchronizes data stored on the upgraded OSS
|
||||
node with other data nodes in the cluster.
|
||||
It may take a few minutes before the existing data is available.
|
||||
|
||||
-->
|
||||
## Rebalance the cluster
|
||||
|
||||
1. Use the [`ALTER RETENTION POLICY`](/enterprise_influxdb/v1.10/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy)
|
||||
statement to increase the [replication factor](/enterprise_influxdb/v1.10/concepts/glossary/#replication-factor)
|
||||
on all existing retention policies to the number of data nodes in your cluster (see the sketch after this list).
|
||||
2. [Rebalance your cluster manually](/enterprise_influxdb/v1.10/guides/rebalance/)
|
||||
to meet the desired replication factor for existing shards.
|
||||
3. If you were using [Chronograf](/{{< latest "chronograf" >}}/),
|
||||
add your Enterprise instance as a new data source.
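
For example, on a cluster with two data nodes, raising the replication factor of an existing retention policy might look like this (database and RP names are illustrative):

```sql
ALTER RETENTION POLICY "autogen" ON "mydb" REPLICATION 2
```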
|
|
@ -0,0 +1,111 @@
|
|||
---
|
||||
title: Query data with the InfluxDB API
|
||||
description: Query data with Flux and InfluxQL in the InfluxDB API.
|
||||
alias:
|
||||
  - /docs/v1.8/query_language/querying_data/
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 20
|
||||
parent: Guides
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/querying_data/
|
||||
v2: /influxdb/v2.0/query-data/
|
||||
---
|
||||
|
||||
|
||||
The InfluxDB API is the primary means for querying data in InfluxDB (see the [command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/) and [client libraries](/enterprise_influxdb/v1.10/tools/api_client_libraries/) for alternative ways to query the database).
|
||||
|
||||
Query data with the InfluxDB API using [Flux](#query-data-with-flux) or [InfluxQL](#query-data-with-influxql).
|
||||
|
||||
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
|
||||
|
||||
## Query data with Flux
|
||||
|
||||
For Flux queries, the `/api/v2/query` endpoint accepts `POST` HTTP requests. Use the following HTTP headers:
|
||||
- `Accept: application/csv`
|
||||
- `Content-type: application/vnd.flux`
|
||||
|
||||
If you have authentication enabled, provide your InfluxDB username and password with the `Authorization` header and `Token` scheme. For example: `Authorization: Token username:password`.
|
||||
|
||||
|
||||
The following example queries Telegraf data using Flux:
|
||||
|
||||
|
||||
```bash
|
||||
$ curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
Flux returns [annotated CSV](/influxdb/v2.0/reference/syntax/annotated-csv/):
|
||||
|
||||
```
|
||||
,result,table,_start,_stop,_time,_value,_field,_measurement,cpu,host
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:19Z,4.152553004641827,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:29Z,7.608695652173913,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:39Z,2.9363988504310883,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:49Z,6.915093159934975,usage_user,cpu,cpu-total,host1
|
||||
```
|
||||
|
||||
The header row defines column labels for the table. The `cpu` [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) has four points, each represented by one of the record rows. For example, the first point has a [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp) of `2020-04-07T18:08:19Z`.
|
||||
|
||||
### Flux
|
||||
|
||||
Check out [Get started with Flux](/influxdb/v2.0/query-data/get-started/) to learn more about building queries with Flux.
|
||||
For more information about querying data with the InfluxDB API using Flux, see the [API reference documentation](/enterprise_influxdb/v1.10/tools/api/#influxdb-2-0-api-compatibility-endpoints).
|
||||
|
||||
## Query data with InfluxQL
|
||||
|
||||
To perform an InfluxQL query, send a `GET` request to the `/query` endpoint, set the URL parameter `db` as the target database, and set the URL parameter `q` as your query.
|
||||
You can also use a `POST` request by sending the same parameters either as URL parameters or as part of the body with `application/x-www-form-urlencoded`.
|
||||
The examples below use the InfluxDB API to query the same database that you encountered in [Writing Data](/enterprise_influxdb/v1.10/guides/writing_data/).
|
||||
|
||||
```bash
|
||||
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
|
||||
```
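
The same query can be sent as a `POST` request with the parameters form-encoded in the request body, which is convenient for long queries:

```bash
curl -XPOST 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```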
|
||||
|
||||
InfluxDB returns JSON:
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"results": [
|
||||
{
|
||||
"statement_id": 0,
|
||||
"series": [
|
||||
{
|
||||
"name": "cpu_load_short",
|
||||
"columns": [
|
||||
"time",
|
||||
"value"
|
||||
],
|
||||
"values": [
|
||||
[
|
||||
"2015-01-29T21:55:43.702900257Z",
|
||||
2
|
||||
],
|
||||
[
|
||||
"2015-01-29T21:55:43.702900257Z",
|
||||
0.55
|
||||
],
|
||||
[
|
||||
"2015-06-11T20:46:02Z",
|
||||
0.64
|
||||
]
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
> **Note:** Appending `pretty=true` to the URL enables pretty-printed JSON output.
|
||||
While this is useful for debugging or when querying directly with tools like `curl`, it is not recommended for production use as it consumes unnecessary network bandwidth.
|
||||
|
||||
### InfluxQL
|
||||
|
||||
Check out the [Data Exploration page](/enterprise_influxdb/v1.10/query_language/explore-data/) to get acquainted with InfluxQL.
|
||||
For more information about querying data with the InfluxDB API using InfluxQL, see the [API reference documentation](/enterprise_influxdb/v1.10/tools/api/#influxdb-1-x-http-endpoints).
|
|
@ -0,0 +1,190 @@
|
|||
---
|
||||
title: Write data with the InfluxDB API
|
||||
description: >
|
||||
Use the command line interface (CLI) to write data into InfluxDB with the API.
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
weight: 10
|
||||
parent: Guides
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.10/guides/writing_data/
|
||||
v2: /influxdb/v2.0/write-data/
|
||||
---
|
||||
|
||||
Write data into InfluxDB using the [command line interface](/enterprise_influxdb/v1.10/tools/influx-cli/use-influx/), [client libraries](/enterprise_influxdb/v1.10/clients/api/), and plugins for common data formats such as [Graphite](/enterprise_influxdb/v1.10/write_protocols/graphite/).
|
||||
|
||||
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
|
||||
|
||||
### Create a database using the InfluxDB API
|
||||
|
||||
To create a database send a `POST` request to the `/query` endpoint and set the URL parameter `q` to `CREATE DATABASE <new_database_name>`.
|
||||
The example below sends a request to InfluxDB running on `localhost` and creates the `mydb` database:
|
||||
|
||||
```bash
|
||||
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
|
||||
```
|
||||
|
||||
### Write data using the InfluxDB API
|
||||
|
||||
The InfluxDB API is the primary means of writing data into InfluxDB.
|
||||
|
||||
- To **write to a database using the InfluxDB 1.8 API**, send `POST` requests to the `/write` endpoint. For example, the following request writes a single point to the `mydb` database.
|
||||
The data consists of the [measurement](/enterprise_influxdb/v1.10/concepts/glossary/#measurement) `cpu_load_short`, the [tag keys](/enterprise_influxdb/v1.10/concepts/glossary/#tag-key) `host` and `region` with the [tag values](/enterprise_influxdb/v1.10/concepts/glossary/#tag-value) `server01` and `us-west`, the [field key](/enterprise_influxdb/v1.10/concepts/glossary/#field-key) `value` with a [field value](/enterprise_influxdb/v1.10/concepts/glossary/#field-value) of `0.64`, and the [timestamp](/enterprise_influxdb/v1.10/concepts/glossary/#timestamp) `1434055562000000000`.
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
|
||||
--data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
|
||||
```
|
||||
|
||||
- To **write to a database using the InfluxDB 2.0 API (compatible with InfluxDB 1.8+)**, send `POST` requests to the [`/api/v2/write` endpoint](/enterprise_influxdb/v1.10/tools/api/#api-v2-write-http-endpoint):
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/api/v2/write?bucket=db/rp&precision=ns' \
|
||||
--header 'Authorization: Token username:password' \
|
||||
--data-raw 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
|
||||
```
|
||||
|
||||
When writing points, you must specify an existing database in the `db` query parameter.
|
||||
Points will be written to `db`'s default retention policy if you do not supply a retention policy via the `rp` query parameter.
|
||||
See the [InfluxDB API Reference](/enterprise_influxdb/v1.10/tools/api/#write-http-endpoint) documentation for a complete list of the available query parameters.
|
||||
|
||||
The body of the POST or [InfluxDB line protocol](/enterprise_influxdb/v1.10/concepts/glossary/#influxdb-line-protocol) contains the time series data that you want to store. Data includes:
|
||||
|
||||
- **Measurement (required)**
|
||||
- **Tags**: Strictly speaking, tags are optional but most series include tags to differentiate data sources and to make querying both easy and efficient.
|
||||
Both tag keys and tag values are strings.
|
||||
- **Fields (required)**: Field keys are required and are always strings, and, [by default](/enterprise_influxdb/v1.10/write_protocols/line_protocol_reference/#data-types), field values are floats.
|
||||
- **Timestamp**: Supplied at the end of the line in Unix time in nanoseconds since January 1, 1970 UTC; the timestamp is optional. If you do not specify a timestamp, InfluxDB uses the server's local nanosecond timestamp in Unix epoch.
|
||||
Time in InfluxDB is in UTC format by default.
|
||||
|
||||
> **Note:** Avoid using the following reserved keys: `_field`, `_measurement`, and `time`. If reserved keys are included as a tag or field key, the associated point is discarded.
|
||||
|
||||
### Configure gzip compression
|
||||
|
||||
InfluxDB supports gzip compression. To reduce network traffic, consider the following options:
|
||||
|
||||
* To accept compressed data from InfluxDB, add the `Accept-Encoding: gzip` header to InfluxDB API requests.
|
||||
|
||||
* To compress data before sending it to InfluxDB, add the `Content-Encoding: gzip` header to InfluxDB API requests.
|
||||
|
||||
For details about enabling gzip for client libraries, see your client library documentation.
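
For example, to send gzip-compressed line protocol with `curl`, you might compress a line protocol file first and add the `Content-Encoding` header (the file and database names are illustrative):

```bash
gzip -k cpu_data.txt
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  -H 'Content-Encoding: gzip' \
  --data-binary @cpu_data.txt.gz
```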
|
||||
|
||||
#### Enable gzip compression in the Telegraf InfluxDB output plugin
|
||||
|
||||
* In the Telegraf configuration file (`telegraf.conf`), under `[[outputs.influxdb]]`, change
|
||||
`content_encoding = "identity"` (default) to `content_encoding = "gzip"`
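
A sketch of the relevant section of `telegraf.conf` (the URL and database shown are illustrative):

```toml
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
  content_encoding = "gzip"
```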
|
||||
|
||||
> **Note:**
|
||||
Writes to InfluxDB 2.x [[outputs.influxdb_v2]] are configured to compress content in gzip format by default.
|
||||
|
||||
### Writing multiple points
|
||||
|
||||
Post multiple points to multiple series at the same time by separating each point with a new line.
|
||||
Batching points in this manner results in much higher performance.
|
||||
|
||||
The following example writes three points to the database `mydb`.
|
||||
The first point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02` and has the server's local timestamp.
|
||||
The second point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02,region=us-west` and has the specified timestamp `1422568543702900257`.
|
||||
The third point has the same specified timestamp as the second point, but it is written to the series with the measurement `cpu_load_short` and tag set `direction=in,host=server01,region=us-west`.
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
|
||||
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
|
||||
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257'
|
||||
```
|
||||
|
||||
### Writing points from a file
|
||||
|
||||
Write points from a file by passing `@filename` to `curl`.
|
||||
The data in the file should follow the [InfluxDB line protocol syntax](/enterprise_influxdb/v1.10/write_protocols/write_syntax/).
|
||||
|
||||
Example of a properly-formatted file (`cpu_data.txt`):
|
||||
|
||||
```txt
|
||||
cpu_load_short,host=server02 value=0.67
|
||||
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
|
||||
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257
|
||||
```
|
||||
|
||||
Write the data in `cpu_data.txt` to the `mydb` database with:
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt
|
||||
```
|
||||
|
||||
> **Note:** If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
|
||||
By default, the HTTP request times out after five seconds.
|
||||
InfluxDB will still attempt to write the points after that time out but there will be no confirmation that they were successfully written.
|
||||
|
||||
### Schemaless Design
|
||||
|
||||
InfluxDB is a schemaless database.
|
||||
You can add new measurements, tags, and fields at any time.
|
||||
Note that if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB will reject those data.
|
||||
|
||||
### A note on REST
|
||||
|
||||
InfluxDB uses HTTP solely as a convenient and widely supported data transfer protocol.
|
||||
|
||||
Modern web APIs have settled on REST because it addresses a common need.
|
||||
As the number of endpoints grows, the need for an organizing system becomes pressing.
|
||||
REST is the industry agreed style for organizing large numbers of endpoints.
|
||||
This consistency is good for those developing and consuming the API: everyone involved knows what to expect.
|
||||
|
||||
REST, however, is a convention.
|
||||
InfluxDB makes do with three API endpoints.
|
||||
This simple, easy to understand system uses HTTP as a transfer method for [InfluxQL](/enterprise_influxdb/v1.10/query_language/spec/).
|
||||
The InfluxDB API makes no attempt to be RESTful.
|
||||
|
||||
### HTTP response summary
|
||||
|
||||
* 2xx: If your write request received `HTTP 204 No Content`, it was a success!
|
||||
* 4xx: InfluxDB could not understand the request.
|
||||
* 5xx: The system is overloaded or significantly impaired.
|
||||
|
||||
#### Examples
|
||||
|
||||
##### Writing a float to a field that previously accepted booleans
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=true'
|
||||
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=5'
|
||||
```
|
||||
|
||||
returns:
|
||||
|
||||
```bash
|
||||
HTTP/1.1 400 Bad Request
|
||||
Content-Type: application/json
|
||||
Request-Id: [...]
|
||||
X-Influxdb-Version: {{< latest-patch >}}
|
||||
Date: Wed, 01 Mar 2017 19:38:01 GMT
|
||||
Content-Length: 150
|
||||
|
||||
{"error":"field type conflict: input field \"booleanonly\" on measurement \"tobeornottobe\" is type float, already exists as type boolean dropped=1"}
|
||||
```
|
||||
|
||||
##### Writing a point to a database that doesn't exist
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=atlantis' --data-binary 'liters value=10'
|
||||
```
|
||||
|
||||
returns:
|
||||
|
||||
```bash
|
||||
HTTP/1.1 404 Not Found
|
||||
Content-Type: application/json
|
||||
Request-Id: [...]
|
||||
X-Influxdb-Version: {{< latest-patch >}}
|
||||
Date: Wed, 01 Mar 2017 19:38:35 GMT
|
||||
Content-Length: 45
|
||||
|
||||
{"error":"database not found: \"atlantis\""}
|
||||
```
|
||||
|
||||
### Next steps
|
||||
|
||||
Now that you know how to write data with the InfluxDB API, discover how to query them with the [Querying data](/enterprise_influxdb/v1.10/guides/querying_data/) guide!
|
||||
For more information about writing data with the InfluxDB API, please see the [InfluxDB API reference](/enterprise_influxdb/v1.10/tools/api/#write-http-endpoint).
|
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
title: Introducing InfluxDB Enterprise
|
||||
description: Tasks required to get up and running with InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/introduction/
|
||||
weight: 2
|
||||
|
||||
menu:
|
||||
enterprise_influxdb_1_10:
|
||||
name: Introduction
|
||||
|
||||
---
|
||||
|
||||
{{< children >}}
|