Publish InfluxDB Enterprise 1.9 docs (#2627)
This update involved a large migration of content from InfluxDB 1.8 to Enterprise 1.9.

Co-authored-by: pierwill <pierwill@users.noreply.github.com>
Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>
Co-authored-by: timhallinflux <timhallinflux@users.noreply.github.com>

parent 94c477bc44
commit e5ab8313ec
@ -0,0 +1,41 @@
---
title: InfluxDB Enterprise 1.9 documentation
description: >
  Documentation for InfluxDB Enterprise, which adds clustering, high availability, fine-grained authorization, and more to InfluxDB OSS.
aliases:
  - /enterprise/v1.9/
menu:
  enterprise_influxdb_1_9:
    name: InfluxDB Enterprise v1.9
    weight: 1
---

InfluxDB Enterprise provides a time series database designed to handle high write and query loads, and offers highly scalable clusters on your infrastructure with a management UI. Use it for DevOps monitoring, IoT sensor data, and real-time analytics. Check out the key features that make InfluxDB Enterprise a great choice for working with time series data.

If you're interested in working with InfluxDB Enterprise, visit
[InfluxPortal](https://portal.influxdata.com/) to sign up, get a license key,
and get started!

## Key features

- High-performance datastore written specifically for time series data. High ingest speed and data compression.
- Provides high availability across your cluster and eliminates a single point of failure.
- Written entirely in Go. Compiles into a single binary with no external dependencies.
- Simple, high-performing write and query HTTP APIs (see the example below).
- Plugin support for other data ingestion protocols such as Graphite, collectd, and OpenTSDB.
- Expressive SQL-like query language tailored to easily query aggregated data.
- Continuous queries automatically compute aggregate data to make frequent queries more efficient.
- Tags let you index series for fast and efficient queries.
- Retention policies efficiently auto-expire stale data.
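
For example, each data node exposes simple `/write` and `/query` HTTP endpoints. The following is a minimal sketch assuming a local node on port 8086 and a hypothetical `mydb` database:

```bash
# Create a database, write one point in line protocol, then query it back with InfluxQL.
curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE DATABASE mydb"

curl -XPOST "http://localhost:8086/write?db=mydb" \
  --data-binary 'cpu,host=server01,region=us-west usage_idle=92.5'

curl -G "http://localhost:8086/query?db=mydb" \
  --data-urlencode "q=SELECT mean(\"usage_idle\") FROM \"cpu\" WHERE time > now() - 1h GROUP BY \"host\""
```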

## Next steps

- [Install and deploy](/enterprise_influxdb/v1.9/install-and-deploy/)
- Review key [concepts](/enterprise_influxdb/v1.9/concepts/)
- [Get started](/enterprise_influxdb/v1.9/introduction/getting-started/)

<!-- Monitor your cluster
- Manage queries
- Manage users
- Explore and visualize your data
-->
@ -0,0 +1,49 @@
---
title: About the project
description: >
  Release notes, licenses, and third-party software details for InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_9_ref:
    weight: 10
---

{{< children hlevel="h2" >}}

## Commercial license

InfluxDB Enterprise is available with a commercial license. [Contact sales for more information](https://www.influxdata.com/contact-sales/).

## Third party software

InfluxData products contain third party software, which means the copyrighted, patented, or otherwise legally protected
software of third parties that is incorporated in InfluxData products.

Third party suppliers make no representation nor warranty with respect to such third party software or any portion thereof.
Third party suppliers assume no liability for any claim that might arise with respect to such third party software, nor for a
customer's use of or inability to use the third party software.

In addition to third party software incorporated in InfluxDB, InfluxDB Enterprise incorporates the following additional third party software:

| Third Party / Open Source Software - Description | License Type |
| ------------------------------------------------ | ------------ |
| [Go language library for exporting performance and runtime metrics to external metrics systems (i.e., statsite, statsd)](https://github.com/armon/go-metrics) (armon/go-metrics) | [MIT](https://github.com/armon/go-metrics/blob/master/LICENSE) |
| [Golang implementation of JavaScript Object Signing and Encryption (JOSE)](https://github.com/dvsekhvalnov/jose2go) (dvsekhvalnov/jose2go) | [MIT](https://github.com/dvsekhvalnov/jose2go/blob/master/LICENSE) |
| [Collection of useful handlers for Go net/http package](https://github.com/gorilla/handlers) (gorilla/handlers) | [BSD-2](https://github.com/gorilla/handlers/blob/master/LICENSE) |
| [A powerful URL router and dispatcher for golang](https://github.com/gorilla/mux) (gorilla/mux) | [BSD-3](https://github.com/gorilla/mux/blob/master/LICENSE) |
| [Golang connection multiplexing library](https://github.com/hashicorp/yamux/) (hashicorp/yamux) | [Mozilla 2.0](https://github.com/hashicorp/yamux/blob/master/LICENSE) |
| [Codec - a high performance and feature-rich Idiomatic encode/decode and rpc library for msgpack and Binc](https://github.com/hashicorp/go-msgpack) (hashicorp/go-msgpack) | [BSD-3](https://github.com/hashicorp/go-msgpack/blob/master/LICENSE) |
| [Go language implementation of the Raft consensus protocol](https://github.com/hashicorp/raft) (hashicorp/raft) | [Mozilla 2.0](https://github.com/hashicorp/raft/blob/master/LICENSE) |
| [Raft backend implementation using BoltDB](https://github.com/hashicorp/raft-boltdb) (hashicorp/raft-boltdb) | [Mozilla 2.0](https://github.com/hashicorp/raft-boltdb/blob/master/LICENSE) |
| [Pretty printing for Go values](https://github.com/kr/pretty) (kr/pretty) | [MIT](https://github.com/kr/pretty/blob/master/License) |
| [Miscellaneous functions for formatting text](https://github.com/kr/text) (kr/text) | [MIT](https://github.com/kr/text/blob/main/License) |
| [Some helpful packages for writing Go apps](https://github.com/markbates/going) (markbates/going) | [MIT](https://github.com/markbates/going/blob/master/LICENSE.txt) |
| [Basic LDAP v3 functionality for the Go programming language](https://github.com/mark-rushakoff/ldapserver) (mark-rushakoff/ldapserver) | [BSD-3-Clause](https://github.com/markbates/going/blob/master/LICENSE) |
| [Basic LDAP v3 functionality for the Go programming language](https://github.com/go-ldap/ldap) (go-ldap/ldap) | [MIT](https://github.com/go-ldap/ldap/blob/master/LICENSE) |
| [ASN1 BER Encoding / Decoding Library for the Go programming language](https://github.com/go-asn1-ber/asn1-ber) (go-asn1-ber/asn1-ber) | [MIT](https://github.com/go-asn1-ber/asn1-ber/blob/master/LICENSE) |
| [A golang registry for global request variables](https://github.com/gorilla/context) (gorilla/context) | [BSD-3-Clause](https://github.com/gorilla/context/blob/master/LICENSE) |
| [An immutable radix tree implementation in Golang](https://github.com/hashicorp/go-immutable-radix) (hashicorp/go-immutable-radix) | [MPL-2.0](https://github.com/hashicorp/go-immutable-radix/blob/master/LICENSE) |
| [Golang LRU cache](https://github.com/hashicorp/golang-lru) (hashicorp/golang-lru) | [MPL-2.0](https://github.com/hashicorp/golang-lru/blob/master/LICENSE) |
| [Go Zap structured logging implementation](https://go.uber.org/zap/zapcore) | [MIT](https://github.com/uber-go/zap/blob/master/LICENSE.txt) |

***Thanks to the open source community for your contributions!***
@ -0,0 +1,971 @@
---
title: InfluxDB Enterprise 1.9 release notes
description: >
  Important changes and what's new in each version of InfluxDB Enterprise.
menu:
  enterprise_influxdb_1_9_ref:
    name: Release notes
    weight: 10
    parent: About the project
---

## v1.9.2 [2021-06-17]

The release of InfluxDB Enterprise 1.9 is different from previous InfluxDB Enterprise releases
in that there is no corresponding InfluxDB OSS release.
(InfluxDB 1.8.x will continue to receive maintenance updates.)

### Features

- Upgrade to Go 1.15.10.
- Support user-defined *node labels*.
  Node labels let you assign arbitrary key-value pairs to meta and data nodes in a cluster.
  For instance, an operator might want to label nodes with the availability zone in which they're located.
- Improve performance of `SHOW SERIES CARDINALITY` and `SHOW SERIES CARDINALITY from <measurement>` InfluxQL queries.
  These queries now return a `cardinality estimation` column header where before they returned `count` (see the example after this list).
- Improve diagnostics for license problems.
  Add [license expiration date](/enterprise_influxdb/v1.9/features/clustering-features/#entitlements) to `debug/vars` metrics.
- Add improved [ingress metrics](/enterprise_influxdb/v1.9/administration/config-data-nodes/#ingress-metric-by-measurement-enabled--false) to track points written by measurement and by login.
  Allow for collection of statistics regarding points, values, and new series written per measurement and by login.
  This data is collected and exposed at the data node level.
  With these metrics you can, for example,
  aggregate the write requests across the entire cluster,
  monitor the growth of series within a measurement,
  and track what user credentials are being used to write data.
- Support authentication for Kapacitor via LDAP.
- Support for [configuring Flux query resource usage](/enterprise_influxdb/v1.9/administration/config-data-nodes/#flux-controller) (concurrency, memory, etc.).
- Upgrade to [Flux v0.113.0](/influxdb/v2.0/reference/release-notes/flux/#v01130-2021-04-21).
- Update Prometheus remote protocol to allow streamed reading.
- Improve performance of sorted merge iterator.
- Add arguments to Flux `to` function.
- Add meancount aggregation for WindowAggregate pushdown.
- Optimize series iteration in TSI.
- Add `WITH KEY` to `SHOW TAG KEYS`.
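
A quick way to see the new cardinality output and the expanded `debug/vars` metrics (a sketch, assuming a local data node and a hypothetical `telegraf` database; ingress statistics also require the ingress metric settings to be enabled on the data node):

```bash
# Returns a `cardinality estimation` column (previously `count`).
influx -database 'telegraf' -execute 'SHOW SERIES CARDINALITY'

# Ingress and license stats are exposed per data node; exact metric names may vary.
curl -s "http://localhost:8086/debug/vars" | tr ',' '\n' | grep -iE 'ingress|license'
```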

### Bug fixes

- `show databases` now checks read and write permissions.
- Anti-entropy: Update `tsm1.BlockCount()` call to match signature.
- Remove extraneous nil check from points writer.
- Ensure a newline is printed after a successful copy during [restoration](/enterprise_influxdb/v1.9/administration/backup-and-restore/).
- Make `entropy show` expiration times consistent with `show-shards`.
- Properly shut down multiple HTTP servers.
- Allow CORS in v2 compatibility endpoints.
- Address staticcheck warnings SA4006, ST1006, S1039, and S1020.
- Fix Anti-Entropy looping endlessly with an empty shard.
- Disable MergeFiltersRule until it is more stable.
- Fix data race and validation in cache ring.
- Return error on nonexistent shard ID.
- Add `User-Agent` to allowed CORS headers.
- Fix variables masked by a declaration.
- Fix key collisions when serializing `/debug/vars`.
- Fix temporary directory search bug.
- Grow tag index buffer if needed.
- Use native type for summation in new meancount iterator.
- Fix consistent error for missing shard.
- Properly read payload in `snapshotter`.
- Fix help text for `influx_inspect`.
- Allow `PATCH` in CORS.
- Fix `GROUP BY` returning multiple results per group in some circumstances.
- Add option to authenticate Prometheus remote read.
- Fix FGA enablement.
- Fix "snapshot in progress" error during backup.
- Fix cursor requests (`[start, stop]` instead of `[start, stop)`).
- Exclude stop time from array cursors.
- Fix Flux regression in buckets query.
- Fix redundant registration for Prometheus collector metrics.
- Re-add Flux CLI.
- Use non-nil `context.Context` value in client.

### Other changes

- Remove `influx_stress` tool (deprecated since version 1.2).
  Instead, use [`inch`](https://github.com/influxdata/inch)
  or [`influx-stress`](https://github.com/influxdata/influx-stress) (not to be confused with `influx_stress`).

{{% note %}}
**Note:** InfluxDB Enterprise 1.9.0 and 1.9.1 were not released.
Bug fixes intended for 1.9.0 and 1.9.1 were rolled into InfluxDB Enterprise 1.9.2.
{{% /note %}}

## v1.8.6 [2021-05-21]

{{% warn %}}
**Fine-grained authorization security update.**
If using **InfluxDB Enterprise 1.8.5**, we strongly recommend upgrading to **InfluxDB Enterprise 1.8.6** immediately.
1.8.5 does not correctly enforce grants with specified permissions for users.
Versions prior to InfluxDB Enterprise 1.8.5 are not affected.
1.8.6 ensures that only users with sufficient permissions can read and write to a measurement.
{{% /warn %}}

### Features

- **Enhanced Anti-Entropy (AE) logging**: When the [debug logging level](/enterprise_influxdb/v1.8/administration/config-data-nodes/#logging-settings) is set (`level="debug"`) in the data node configuration, the Anti-Entropy service reports reasons a shard is not idle, including (see the sketch after this list):
  - active Cache compactions
  - active Level (Zero, One, Two) compactions
  - active Full compactions
  - active TSM Optimization compactions
  - cache size is nonzero
  - shard is not fully compacted
- **Enhanced `copy-shard` logging**. Add information to log messages in `copy-shard` functions and additional error tests.
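
A minimal sketch of turning this on and watching for the new messages (paths and the service name assume a default systemd package install):

```bash
# In the [logging] section of each data node's /etc/influxdb/influxdb.conf, set:
#   level = "debug"
# Then restart the data node and follow the Anti-Entropy output.
sudo systemctl restart influxdb
journalctl -u influxdb -f | grep -i "anti-entropy"
```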

### Bug fixes

- Use the proper TLS configuration when a meta node makes a remote procedure call (RPC) to a data node. Addresses RPC call issues when using the following `influxd-ctl` commands: `copy-shard`, `copy-shard-status`, `kill-copy-shard`, `remove-shard`.
- Previously, the Anti-Entropy service would loop trying to copy an empty shard to a data node missing that shard. Now, an empty shard is successfully created on a new node.
- Check for previously ignored errors in `DiffIterator.Next()`. Update to check before possible function exit and ensure handles are closed on error in digest diffs.

## v1.8.5 [2021-04-20]

The InfluxDB Enterprise v1.8.5 release builds on the InfluxDB OSS v1.8.5 release.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/#v185-2021-04-20).

### Bug fixes

- Resolve TSM backup "snapshot in progress" error.
- `SHOW DATABASES` now only shows databases that the user has either read or write access to.
- `influxd-ctl entropy show` now shows shard expiry times consistent with `influxd-ctl show-shards`.
- Add labels to the values returned in `SHOW SHARDS` output to clarify the node ID and TCP address.
- Always forward repairs to the next data node (even if the current data node does not have to take action for the repair).

## v1.8.4 [2021-02-08]

The InfluxDB Enterprise 1.8.4 release builds on the InfluxDB OSS 1.8.4 release.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/#v1-8-4-unreleased).

> **Note:** InfluxDB Enterprise 1.8.3 was not released. Bug fixes intended for 1.8.3 were rolled into InfluxDB Enterprise 1.8.4.

### Features

#### Update your InfluxDB Enterprise license without restarting data nodes

Add the ability to [renew or update your license key or file](/enterprise_influxdb/v1.8/administration/renew-license/) without restarting data nodes.

### Bug fixes

- Wrap TCP mux–based HTTP server with a function that adds custom headers.
- Correct output for `influxd-ctl show-shards`.
- Properly encode/decode `control.Shard.Err`.

## v1.8.2 [2020-08-24]

The InfluxDB Enterprise 1.8.2 release builds on the InfluxDB OSS 1.8.2 and 1.8.1 releases.
Due to a defect in InfluxDB OSS 1.8.1, InfluxDB Enterprise 1.8.1 was not released.
This release resolves the defect and includes the features and bug fixes listed below.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/).

### Features

#### Hinted handoff improvements

- Allow out-of-order writes. This change adds a configuration option `allow-out-of-order-writes` to the `[cluster]` section of the data node configuration file. This setting defaults to `false` to match the existing behavior, and there are some important operational considerations to review before turning it on. Enabling this option reduces the time required to drain the hinted handoff queue and increases throughput during recovery. See [allow-out-of-order-writes](/enterprise_influxdb/v1.8/administration/config-data-nodes#allow-out-of-order-false) for more detail.
- Make the number of pending writes configurable. This change adds a configuration option in the `[hinted-handoff]` section called `max-pending-writes`, which defaults to `1024`. See [max-pending-writes](/enterprise_influxdb/v1.8/administration/config-data-nodes#max-pending-writes-1024) for more detail.
- Update the hinted handoff queue to ensure that entries are written to segment files atomically. Prior to this change, entries were written to disk in three separate writes (len, data, offset). If the process stopped in the middle of any of those writes, the hinted handoff segment file was left in an invalid state.
- In certain scenarios, the hinted-handoff queue would fail to drain. Upon node startup, the queue segment files are now verified and truncated if any are corrupted. Some additional logging has been added when a node starts writing to the hinted handoff queue as well.

#### `influxd-ctl` CLI improvements

- Add a verbose flag to [`influxd-ctl show-shards`](/enterprise_influxdb/v1.8/administration/cluster-commands/#show-shards). This option provides more information about each shard owner, including the state (hot/cold), last modified date and time, and size on disk, as shown below.
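
For example (a quick sketch, run from any meta node):

```bash
# Summary view
influxd-ctl show-shards

# Verbose view: adds each shard owner's state (hot/cold), last modified time, and size on disk
influxd-ctl show-shards -v
```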

### Bug fixes

- Resolve a cluster read service issue that caused a panic. Previously, if no tag keys or values were read, the cluster read service returned a nil cursor. Now, an empty cursor is returned.
- LDAP configuration: `GroupSearchBaseDNs`, `SearchFilter`, `GroupMembershipSearchFilter`, and `GroupSearchFilter` values in the LDAP section of the configuration file are now all escaped.
- Eliminate orphaned, temporary directories when an error occurs during `processCreateShardSnapshotRequest()` and provide useful log information regarding the reason a temporary directory is created.

## v1.8 [2020-04-27]

The InfluxDB Enterprise 1.8 release builds on the InfluxDB OSS 1.8 release.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/).

### Features

#### **Back up meta data only**

- Add option to back up **meta data only** (users, roles, databases, continuous queries, and retention policies) using the new `-strategy` flag and `only-meta` option: `influxd-ctl backup -strategy only-meta </your-backup-directory>`.

> **Note:** To restore a meta data backup, use the `restore -full` command and specify your backup manifest: `influxd-ctl restore -full </backup-directory/backup.manifest>`.

For more information, see [Perform a metastore only backup](/enterprise_influxdb/v1.8/administration/backup-and-restore/#perform-a-metastore-only-backup).

#### **Incremental and full backups**

- Add `incremental` and `full` backup options to the new `-strategy` flag in `influxd-ctl backup`:
  - `influxd-ctl backup -strategy incremental`
  - `influxd-ctl backup -strategy full`

For more information, see the [`influxd-ctl backup` syntax](/enterprise_influxdb/v1.8/administration/backup-and-restore/#syntax).
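
A short sketch of the three strategies (the target directories are hypothetical examples):

```bash
# Meta data only: users, roles, databases, continuous queries, retention policies
influxd-ctl backup -strategy only-meta /tmp/backups/meta-only

# Incremental and full backups of meta data and shard data
influxd-ctl backup -strategy incremental /tmp/backups/incremental
influxd-ctl backup -strategy full /tmp/backups/full
```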

### Bug fixes

- Update the Anti-Entropy (AE) service to ignore expired shards.

## v1.7.10 [2020-02-07]

The InfluxDB Enterprise 1.7.10 release builds on the InfluxDB OSS 1.7.10 release.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Features

- Log when meta state file cannot be opened.

### Bugfixes

- Update `MaxShardGroupID` on meta update.
- Don't reassign shard ownership when removing a data node.

## v1.7.9 [2019-10-27]

The InfluxDB Enterprise 1.7.9 release builds on the InfluxDB OSS 1.7.9 release.
For details on changes incorporated from the InfluxDB OSS release, see
[InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Release notes

- This release is built using Go 1.12.10, which eliminates the
  [HTTP desync vulnerability](https://portswigger.net/research/http-desync-attacks-request-smuggling-reborn).

### Bug fixes

- Move `tsdb store open` to beginning of server initialization.
- Enable Meta client and Raft to use verified TLS.
- Fix RPC pool TLS configuration.
- Update example configuration file with new authorization options.

## 1.7.8 [2019-09-03]

{{% warn %}}
InfluxDB now rejects all non-UTF-8 characters.
To successfully write data to InfluxDB, use only UTF-8 characters in
database names, measurement names, tag sets, and field sets.
InfluxDB Enterprise customers can contact InfluxData support for more information.
{{% /warn %}}

The InfluxDB Enterprise 1.7.8 release builds on the InfluxDB OSS 1.7.8 release.
For details on changes incorporated from the InfluxDB OSS release, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Bug fixes

- Clarified `influxd-ctl` error message when the Anti-Entropy (AE) service is disabled.
- Ensure invalid, non-UTF-8 data is removed from hinted handoff.
- Added error messages for `INFLUXDB_LOGGING_LEVEL` if misconfigured.
- Added logging when data nodes connect to meta service.

### Features

- The Flux Technical Preview has advanced to version [0.36.2](/flux/v0.36/).

## 1.7.7 [2019-07-12]

The InfluxDB Enterprise 1.7.7 release builds on the InfluxDB OSS 1.7.7 release. For details on changes incorporated from the InfluxDB OSS release, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Known issues

- The Flux Technical Preview was not advanced and remains at version 0.24.0. Next month's maintenance release will update the preview.
- After upgrading, customers have experienced excessively large log output (additional lines) due to a `Println` statement introduced in this release. For a possible workaround, see https://github.com/influxdata/influxdb/issues/14265#issuecomment-508875853. Next month's maintenance release will address this issue.

### Features

- Add TLS to RPC calls. If verifying certificates, use the TLS settings in the configuration passed in with `-config`.

### Bug fixes

- Ensure retry-rate-limit configuration value is used for hinted handoff.
- Always forward AE repair to next node.
- Improve hinted handoff metrics.

## 1.7.6 [2019-05-07]

This InfluxDB Enterprise release builds on the InfluxDB OSS 1.7.6 release. For details on changes incorporated from the InfluxDB OSS release, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Bug fixes

- Revert v1.7.5 InfluxQL regressions that removed parentheses, changing operator precedence and producing incorrect results in complex queries and regular expressions.

## 1.7.5 [2019-03-26]

{{% warn %}}

**If you are currently on this release, roll back to v1.7.4 until a fix is available.**

After upgrading to this release, some customers have experienced regressions,
including parentheses being removed from queries, with the resulting operator precedence
changing results in complex queries and regular expressions.

Examples:

- Complex WHERE clauses with parentheses. For example, `WHERE d > 100 AND (c = 'foo' OR v = 'bar')`.
- Conditions not including parentheses, causing operator precedence to return `(a AND b) OR c` instead of `a AND (b OR c)`.

{{% /warn %}}

This InfluxDB Enterprise release builds on the InfluxDB OSS 1.7.5 release. For details on changes incorporated from the InfluxDB OSS release, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Features

- Add the `influx_tools` utility (for internal support use) to the packaging.

### Bug fixes

- Anti-Entropy: fix `contains no .tsm files` error.
- `fix(cluster)`: account for nil result set when writing read response.

## 1.7.4 [2019-02-13]

This InfluxDB Enterprise release builds on the InfluxDB OSS 1.7.4 release. For details on changes incorporated from the InfluxDB OSS release, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Bug fixes

- Use `systemd` for Amazon Linux 2.

## 1.7.3 [2019-01-11]

This InfluxDB Enterprise release builds on the InfluxDB OSS 1.7.3 release. For details on changes incorporated from the InfluxDB OSS release, see the [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Important update [2019-02-13]

If you have not installed this release, then install the 1.7.4 release.

**If you are currently running this release, then upgrade to the 1.7.4 release as soon as possible.**

- A critical defect in the InfluxDB 1.7.3 release was discovered and our engineering team fixed the issue in the 1.7.4 release. Out of high concern for your data and projects, upgrade to the 1.7.4 release as soon as possible.
  - **Critical defect:** Shards larger than 16GB are at high risk for data loss during full compaction. The full compaction process runs when a shard goes "cold" – no new data is being written into the database during the time range specified by the shard.
  - **Post-mortem analysis:** InfluxData engineering is performing a post-mortem analysis to determine how this defect was introduced. Their discoveries will be shared in a blog post.
- A small percentage of customers experienced data node crashes with segmentation violation errors. We fixed this issue in 1.7.4.

### Breaking changes

- Fix invalid UTF-8 bytes preventing shard opening. Treat fields and measurements as raw bytes.

### Features

#### Anti-entropy service disabled by default

Prior to v1.7.3, the anti-entropy (AE) service was enabled by default. When shards create large digests with lots of time ranges (tens of thousands), some customers experienced significant performance issues, including CPU usage spikes. If your shards include a small number of time ranges (most have 1 to 10, some have up to several hundred) and you can benefit from the AE service, then you can enable AE and watch to see if performance is significantly impacted.

- Add user authentication and authorization support for Flux HTTP requests.
- Add support for optionally logging Flux queries.
- Add support for LDAP StartTLS.
- Flux 0.7 support.
- Implement TLS between data nodes.
- Update to Flux 0.7.1.
- Add optional TLS support to meta node Raft port.
- Anti-Entropy: memoize `DistinctCount`, `min`, & `max` time.
- Update influxdb dep for subquery auth update.

### Bug fixes

- Update sample configuration.

## 1.6.6 [2019-02-28]

This release only includes the InfluxDB OSS 1.6.6 changes (no Enterprise-specific changes).

## 1.6.5 [2019-01-10]

This release builds off of the InfluxDB OSS 1.6.0 through 1.6.5 releases. For details about changes incorporated from InfluxDB OSS releases, see [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

## 1.6.4 [2018-10-23]

This release builds off of the InfluxDB OSS 1.6.0 through 1.6.4 releases. For details about changes incorporated from InfluxDB OSS releases, see the [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Breaking changes

#### Require `internal-shared-secret` if meta auth enabled

If `[meta] auth-enabled` is set to `true`, the `[meta] internal-shared-secret` value must be set in the configuration.
If it is not set, an error will be logged and `influxd-meta` will not start.

* Previously, authentication could be enabled without setting an `internal-shared-secret`. The security risk was that an unset (empty) value could be used for the `internal-shared-secret`, seriously weakening the JWT authentication used for internode communication.

#### Review production installation configurations

The [Production Installation](/enterprise_influxdb/v1.7/production_installation/)
documentation has been updated to fix errors in configuration settings, including changing `shared-secret` to `internal-shared-secret` and adding missing steps for configuration settings of data nodes and meta nodes. All Enterprise users should review their current configurations to ensure that the configuration settings properly enable JWT authentication for internode communication.

The following summarizes the expected settings for proper configuration of JWT authentication for internode communication (a short generation sketch follows the settings):

##### Data node configuration files (`influxdb.conf`)

**[http] section**

* `auth-enabled = true`
  - Enables authentication. Default value is `false`.

**[meta] section**

- `meta-auth-enabled = true`
  - Must match the meta nodes' `[meta] auth-enabled` setting.
- `meta-internal-shared-secret = "<long-pass-phrase>"`
  - Must be the same passphrase as the meta nodes' `[meta] internal-shared-secret` setting.
  - Used by the internal API for JWT authentication. Default value is `""`.
  - A long passphrase is recommended for stronger security.

##### Meta node configuration files (`influxdb-meta.conf`)

**[meta]** section

- `auth-enabled = true`
  - Enables authentication. Default value is `false`.
- `internal-shared-secret = "<long-pass-phrase>"`
  - Must be the same passphrase as the data nodes' `[meta] meta-internal-shared-secret` setting.
  - Used by the internal API for JWT authentication. Default value is `""`.
  - A long passphrase is recommended for better security.

> **Note:** To provide encrypted internode communication, you must enable HTTPS. Although the JWT signature is encrypted, the payload of a JWT token is encoded, but not encrypted.
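
A minimal sketch of generating one long passphrase and where each value goes (`openssl` is just one way to produce a random string; any long secret works):

```bash
# Generate a single long passphrase and use the SAME value on every node.
SECRET=$(openssl rand -base64 48)

# Meta nodes (influxdb-meta.conf, [meta] section):
echo "internal-shared-secret = \"$SECRET\""

# Data nodes (influxdb.conf, [meta] section):
echo "meta-internal-shared-secret = \"$SECRET\""
```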

### Bug fixes

- Only map shards that are reported ready.
- Fix data race when shards are deleted and created concurrently.
- Reject `influxd-ctl update-data` from one existing host to another.
- Require `internal-shared-secret` if meta auth enabled.

## 1.6.2 [2018-08-27]

This release builds off of the InfluxDB OSS 1.6.0 through 1.6.2 releases. For details about changes incorporated from InfluxDB OSS releases, see the [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/).

### Features

- Update Go runtime to `1.10`.
- Provide configurable TLS security options.
- Add LDAP functionality for authorization and authentication.
- Anti-Entropy (AE): add ability to repair shards.
- Anti-Entropy (AE): improve swagger doc for `/status` endpoint.
- Include the query task status in the show queries output.

### Bug fixes

- TSM files not closed when shard is deleted.
- Ensure shards are not queued to copy if a remote node is unavailable.
- Ensure the hinted handoff (hh) queue makes forward progress when segment errors occur.
- Add hinted handoff (hh) queue back pressure.

## 1.5.4 [2018-06-21]

This release builds off of the InfluxDB OSS 1.5.4 release. Please see the [InfluxDB OSS release notes](/influxdb/v1.5/about_the_project/releasenotes-changelog/) for more information about the InfluxDB OSS release.

## 1.5.3 [2018-05-25]

This release builds off of the InfluxDB OSS 1.5.3 release. Please see the [InfluxDB OSS release notes](/influxdb/v1.5/about_the_project/releasenotes-changelog/) for more information about the InfluxDB OSS release.

### Features

* Include the query task status in the show queries output.
* Add hh writeBlocked counter.

### Bug fixes

* Hinted-handoff: enforce max queue size per peer node.
* TSM files not closed when shard deleted.

## v1.5.2 [2018-04-12]

This release builds off of the InfluxDB OSS 1.5.2 release. Please see the [InfluxDB OSS release notes](/influxdb/v1.5/about_the_project/releasenotes-changelog/) for more information about the InfluxDB OSS release.

### Bug fixes

* Run backup snapshot with the client's retryWithBackoff function.
* Ensure that conditions are encoded correctly even if the AST is not properly formed.

## v1.5.1 [2018-03-20]

This release builds off of the InfluxDB OSS 1.5.1 release. There are no Enterprise-specific changes.
Please see the [InfluxDB OSS release notes](/influxdb/v1.7/about_the_project/releasenotes-changelog/) for more information about the InfluxDB OSS release.

## v1.5.0 [2018-03-06]

> ***Note:*** This release builds off of the 1.5 release of InfluxDB OSS. Please see the [InfluxDB OSS release
> notes](/influxdb/v1.5/about_the_project/releasenotes-changelog/) for more information about the InfluxDB OSS release.

For highlights of the InfluxDB 1.5 release, see [What's new in InfluxDB 1.5](/influxdb/v1.5/about_the_project/whats_new/).

### Breaking changes

The default logging format has been changed. See [Logging and tracing in InfluxDB](/influxdb/v1.6/administration/logs/) for details.

### Features

* Add `LastModified` fields to shard RPC calls.
* As of OSS 1.5, backup/restore interoperability is confirmed.
* Make InfluxDB Enterprise use OSS digests.
* Move digest to its own package.
* Implement distributed cardinality estimation.
* Add logging configuration to the configuration files.
* Add AE `/repair` endpoint and update Swagger doc.
* Update logging calls to take advantage of structured logging.
* Use actual URL when logging anonymous stats start.
* Fix auth failures on backup/restore.
* Add support for passive nodes.
* Implement explain plan for remote nodes.
* Add message pack format for query responses.
* Teach `SHOW TAG VALUES` to respect FGA.
* Address deadlock in meta server on 1.3.6.
* Add time support to `SHOW TAG VALUES`.
* Add distributed `SHOW TAG KEYS` with time support.

### Bug fixes

* Fix errors occurring when policy or shard keys are missing from the manifest when limited is set to true.
* Fix spurious `rpc error: i/o deadline exceeded` errors.
* Elide `stream closed` error from logs and handle `io.EOF` as remote iterator interrupt.
* Discard remote iterators that label their type as unknown.
* Do not queue partial write errors to hinted handoff.
* Fix segfault in `digest.merge`.
* Fix meta node CPU pegged on idle cluster.
* Fix data race on `(meta.UserInfo).acl`.
* Fix wildcard when one shard has no data for a measurement with partial replication.
* Add `X-Influxdb-Build` to http response headers so users can identify if a response is from an InfluxDB OSS or InfluxDB Enterprise service.
* Ensure that permissions cannot be set on non-existent databases.
* Switch back to using `cluster-tracing` config option to enable meta HTTP request logging.
* Fix `influxd-ctl restore -newdb` being unable to restore data.
* Close connection for remote iterators after EOF to avoid writer hanging indefinitely.
* Fix data race reading `Len()` in connection pool.
* Use InfluxData fork of `yamux`. This update reduces overall memory usage when streaming large amounts of data.
* Fix group by marshaling in the IteratorOptions.
* Fix meta service data race.
* Read for the interrupt signal from the stream before creating the iterators.
* `SHOW RETENTION POLICIES` now requires the `createdatabase` permission.
* Handle UTF-8 files with a byte order mark when reading the configuration files.
* Remove the pidfile after the server has exited.
* Resend authentication credentials on redirect.
* Updated yamux resolves race condition when SYN is successfully sent and a write timeout occurs.
* Fix "no license" message.

## v1.3.9 [2018-01-19]

### Upgrading -- for users of the TSI preview

If you have been using the TSI preview with 1.3.6 or earlier 1.3.x releases, you will need to follow the upgrade steps to continue using the TSI preview. Unfortunately, these steps cannot be executed while the cluster is operating --
so it will require downtime.

### Bugfixes

* Elide `stream closed` error from logs and handle `io.EOF` as remote iterator interrupt.
* Fix spurious `rpc error: i/o deadline exceeded` errors.
* Discard remote iterators that label their type as unknown.
* Do not queue `partial write` errors to hinted handoff.

## v1.3.8 [2017-12-04]

### Upgrading -- for users of the TSI preview

If you have been using the TSI preview with 1.3.6 or earlier 1.3.x releases, you will need to follow the upgrade steps to continue using the TSI preview. Unfortunately, these steps cannot be executed while the cluster is operating -- so it will require downtime.

### Bugfixes

- Updated `yamux` resolves race condition when SYN is successfully sent and a write timeout occurs.
- Resend authentication credentials on redirect.
- Fix wildcard when one shard has no data for a measurement with partial replication.
- Fix spurious `rpc error: i/o deadline exceeded` errors.

## v1.3.7 [2017-10-26]

### Upgrading -- for users of the TSI preview

The 1.3.7 release resolves a defect that created duplicate tag values in TSI indexes. See Issues
[#8995](https://github.com/influxdata/influxdb/pull/8995) and [#8998](https://github.com/influxdata/influxdb/pull/8998).
However, upgrading to 1.3.7 can cause compactions to fail; see [Issue #9025](https://github.com/influxdata/influxdb/issues/9025).
We will provide a utility that will allow TSI indexes to be rebuilt,
resolving the corruption possible in releases prior to 1.3.7. If you are using the TSI preview,
**you should not upgrade to 1.3.7 until this utility is available**.
We will update this release note with operational steps once the utility is available.

#### Bugfixes

- Read for the interrupt signal from the stream before creating the iterators.
- Address deadlock issue in meta server on 1.3.6.
- Fix logger panic associated with anti-entropy service and manually removed shards.

## v1.3.6 [2017-09-28]

### Bugfixes

- Fix "group by" marshaling in the IteratorOptions.
- Address meta service data race condition.
- Fix race condition when writing points to remote nodes.
- Use InfluxData fork of yamux. This update reduces overall memory usage when streaming large amounts of data.
  Contributed back to the yamux project via: https://github.com/hashicorp/yamux/pull/50
- Address data race reading Len() in connection pool.

## v1.3.5 [2017-08-29]

This release builds off of the 1.3.5 release of OSS InfluxDB.
Please see the OSS [release notes](/influxdb/v1.3/about_the_project/releasenotes-changelog/#v1-3-5-2017-08-29) for more information about the OSS release.

## v1.3.4 [2017-08-23]

This release builds off of the 1.3.4 release of OSS InfluxDB. Please see the [OSS release notes](/influxdb/v1.3/about_the_project/releasenotes-changelog/) for more information about the OSS release.

### Bugfixes

- Close connection for remote iterators after EOF to avoid writer hanging indefinitely.

## v1.3.3 [2017-08-10]

This release builds off of the 1.3.3 release of OSS InfluxDB. Please see the [OSS release notes](/influxdb/v1.3/about_the_project/releasenotes-changelog/) for more information about the OSS release.

### Bugfixes

- Close connections when the `CreateRemoteIterator` RPC returns no iterators, resolving a memory leak.

## v1.3.2 [2017-08-04]

### Bug fixes

- Fix `influxd-ctl restore -newdb` being unable to restore data.
- Improve performance of `SHOW TAG VALUES`.
- Show a subset of config settings in `SHOW DIAGNOSTICS`.
- Switch back to using the `cluster-tracing` config option to enable meta HTTP request logging.
- Fix `remove-data` error.

## v1.3.1 [2017-07-20]

#### Bug fixes

- Show a subset of config settings in `SHOW DIAGNOSTICS`.
- Switch back to using the `cluster-tracing` config option to enable meta HTTP request logging.
- Fix `remove-data` error.

## v1.3.0 [2017-06-21]

### Configuration Changes

#### `[cluster]` Section

* `max-remote-write-connections` is deprecated and can be removed.
* NEW: `pool-max-idle-streams` and `pool-max-idle-time` configure the RPC connection pool (see the sketch below).
  See `config.sample.toml` for descriptions of these new options.
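
A sketch of what the new `[cluster]` settings look like in a data node configuration file (the values below are illustrative only; see `config.sample.toml` for the actual defaults and descriptions):

```bash
# Print an example snippet of the new RPC connection pool settings.
cat <<'EOF'
[cluster]
  pool-max-idle-streams = 100
  pool-max-idle-time = "60s"
EOF
```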

### Removals

The admin UI is removed and unusable in this release. The `[admin]` configuration section will be ignored.

#### Features

- Allow non-admin users to execute `SHOW DATABASES`.
- Add default config path search for `influxd-meta`.
- Reduce cost of admin user check for clusters with large numbers of users.
- Store HH segments by node and shard.
- Remove references to the admin console.
- Refactor RPC connection pool to multiplex multiple streams over a single connection.
- Report RPC connection pool statistics.

#### Bug fixes

- Fix security escalation bug in subscription management.
- Certain permissions should not be allowed at the database context.
- Make the time in `influxd-ctl`'s `copy-shard-status` argument human readable.
- Fix `influxd-ctl remove-data -force`.
- Ensure replaced data node correctly joins meta cluster.
- Delay metadata restriction on restore.
- Fix writing points outside of a retention policy not returning an error.
- Decrement internal database's replication factor when a node is removed.

## v1.2.5 [2017-05-16]

This release builds off of the 1.2.4 release of OSS InfluxDB.
Please see the OSS [release notes](/influxdb/v1.3/about_the_project/releasenotes-changelog/#v1-2-4-2017-05-08) for more information about the OSS release.

#### Bug fixes

- Fix issue where the [`ALTER RETENTION POLICY` query](/influxdb/v1.3/query_language/database_management/#modify-retention-policies-with-alter-retention-policy) does not update the default retention policy.
- Hinted-handoff: remote write errors containing `partial write` are considered droppable.
- Fix the broken `influxd-ctl remove-data -force` command.
- Fix security escalation bug in subscription management.
- Prevent certain user permissions from having a database-specific scope.
- Reduce the cost of the admin user check for clusters with large numbers of users.
- Fix hinted-handoff remote write batching.

## v1.2.2 [2017-03-15]

This release builds off of the 1.2.1 release of OSS InfluxDB.
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.2/CHANGELOG.md#v121-2017-03-08) for more information about the OSS release.

### Configuration Changes

The following configuration settings may need to be changed before [upgrading](/enterprise_influxdb/v1.3/administration/upgrading/) to 1.2.2 from prior versions.

#### shard-writer-timeout

We've removed the data node's `shard-writer-timeout` configuration option from the `[cluster]` section.
As of version 1.2.2, the system sets `shard-writer-timeout` internally.
The configuration option can be removed from the [data node configuration file](/enterprise_influxdb/v1.3/administration/configuration/#data-node-configuration).

#### retention-autocreate

In versions 1.2.0 and 1.2.1, the `retention-autocreate` setting appears in both the meta node and data node configuration files.
To disable retention policy auto-creation, users on version 1.2.0 and 1.2.1 must set `retention-autocreate` to `false` in both the meta node and data node configuration files.

In version 1.2.2, we've removed the `retention-autocreate` setting from the data node configuration file.
As of version 1.2.2, users may remove `retention-autocreate` from the data node configuration file.
To disable retention policy auto-creation, set `retention-autocreate` to `false` in the meta node configuration file only.

This change only affects users who have disabled the `retention-autocreate` option and have installed version 1.2.0 or 1.2.1.
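
A quick way to sanity-check this after upgrading (paths assume a default package install):

```bash
# Should appear only in the meta node configuration (set it to false there to disable auto-creation):
grep -n "retention-autocreate" /etc/influxdb/influxdb-meta.conf

# Can be removed from data node configurations as of 1.2.2:
grep -n "retention-autocreate" /etc/influxdb/influxdb.conf
```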

#### Bug fixes

##### Backup and Restore
<br>

- Prevent the `shard not found` error by making [backups](/enterprise_influxdb/v1.3/guides/backup-and-restore/#backup) skip empty shards
- Prevent the `shard not found` error by making [restore](/enterprise_influxdb/v1.3/guides/backup-and-restore/#restore) handle empty shards
- Ensure that restores from an incremental backup correctly handle file paths
- Allow incremental backups with restrictions (for example, they use the `-db` or `-rp` flags) to be stored in the same directory
- Support restores on meta nodes that are not the raft leader

##### Hinted handoff
<br>

- Fix issue where dropped writes were not recorded when the [hinted handoff](/enterprise_influxdb/v1.3/concepts/clustering/#hinted-handoff) queue reached the maximum size
- Prevent the hinted handoff from becoming blocked if it encounters field type errors

##### Other
<br>

- Return partial results for the [`SHOW TAG VALUES` query](/influxdb/v1.3/query_language/schema_exploration/#show-tag-values) even if the cluster includes an unreachable data node
- Return partial results for the [`SHOW MEASUREMENTS` query](/influxdb/v1.3/query_language/schema_exploration/#show-measurements) even if the cluster includes an unreachable data node
- Prevent a panic when the system fails to process points
- Ensure that cluster hostnames can be case insensitive
- Update the `retryCAS` code to wait for a newer snapshot before retrying
- Serialize access to the meta client and meta store to prevent raft log buildup
- Remove sysvinit package dependency for RPM packages
- Make the default retention policy creation an atomic process instead of a two-step process
- Prevent `influxd-ctl`'s [`join` argument](/enterprise_influxdb/v1.3/features/cluster-commands/#join) from completing a join when the command also specifies the help flag (`-h`)
- Fix the `influxd-ctl`'s [force removal](/enterprise_influxdb/v1.3/features/cluster-commands/#remove-meta) of meta nodes
- Update the meta node and data node sample configuration files

## v1.2.1 [2017-01-25]

#### Cluster-specific Bugfixes

- Fix panic: Slice bounds out of range.
  Fix how the system removes expired shards.
- Remove misplaced newlines from cluster logs

## v1.2.0 [2017-01-24]

This release builds off of the 1.2.0 release of OSS InfluxDB.
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.2/CHANGELOG.md#v120-2017-01-24) for more information about the OSS release.

### Upgrading

* The `retention-autocreate` configuration option has moved from the meta node configuration file to the [data node configuration file](/enterprise_influxdb/v1.3/administration/configuration/#retention-autocreate-true).
  To disable the auto-creation of retention policies, set `retention-autocreate` to `false` in your data node configuration files.
* The previously deprecated `influxd-ctl force-leave` command has been removed. The replacement command to remove a meta node which is never coming back online is [`influxd-ctl remove-meta -force`](/enterprise_influxdb/v1.3/features/cluster-commands/) (see the example below).
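
A sketch of the replacement command (hostnames and ports are hypothetical; the `-force` and `-tcpAddr` flags are intended for meta nodes that are no longer reachable):

```bash
# Remove a reachable meta node
influxd-ctl remove-meta cluster-meta-03:8091

# Remove a meta node that is never coming back online
influxd-ctl remove-meta -force -tcpAddr cluster-meta-03:8089 cluster-meta-03:8091
```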

#### Cluster-specific Features

- Improve the meta store: any meta store changes are done via a compare and swap
- Add support for [incremental backups](/enterprise_influxdb/v1.3/guides/backup-and-restore/)
- Automatically remove any deleted shard groups from the data store
- Uncomment the section headers in the default [configuration file](/enterprise_influxdb/v1.3/administration/configuration/)
- Add InfluxQL support for [subqueries](/influxdb/v1.3/query_language/data_exploration/#subqueries)

#### Cluster-specific Bugfixes

- Update dependencies with Godeps
- Fix a data race in meta client
- Ensure that the system removes the relevant [user permissions and roles](/enterprise_influxdb/v1.3/features/users/) when a database is dropped
- Fix a couple of typos in the demo [configuration file](/enterprise_influxdb/v1.3/administration/configuration/)
- Make the version protobuf field optional for the meta store
- Remove the override of GOMAXPROCS
- Remove an unused configuration option (`dir`) from the backend
- Fix a panic around processing remote writes
- Return an error if a remote write has a field conflict
- Drop points in the hinted handoff that (1) have field conflict errors or (2) have [`max-values-per-tag`](/influxdb/v1.3/administration/config/#max-values-per-tag-100000) errors
- Remove the deprecated `influxd-ctl force-leave` command
- Fix issue where CQs would stop running if the first meta node in the cluster stops
- Fix logging in the meta httpd handler service
- Fix issue where subscriptions send duplicate data for [Continuous Query](/influxdb/v1.3/query_language/continuous_queries/) results
- Fix the output for `influxd-ctl show-shards`
- Send the correct RPC response for `ExecuteStatementRequestMessage`

## v1.1.5 [2017-04-28]

### Bug fixes

- Prevent certain user permissions from having a database-specific scope.
- Fix security escalation bug in subscription management.

## v1.1.3 [2017-02-27]

This release incorporates the changes in the 1.1.4 release of OSS InfluxDB.
Please see the OSS [changelog](https://github.com/influxdata/influxdb/blob/v1.1.4/CHANGELOG.md) for more information about the OSS release.

### Bug fixes

- Delay when a node listens for network connections until after all requisite services are running. This prevents queries to the cluster from failing unnecessarily.
- Allow users to set the `GOMAXPROCS` environment variable.

## v1.1.2 [internal]

This release was an internal release only.
It incorporates the changes in the 1.1.3 release of OSS InfluxDB.
Please see the OSS [changelog](https://github.com/influxdata/influxdb/blob/v1.1.3/CHANGELOG.md) for more information about the OSS release.

## v1.1.1 [2016-12-06]

This release builds off of the 1.1.1 release of OSS InfluxDB.
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.1/CHANGELOG.md#v111-2016-12-06) for more information about the OSS release.

This release is built with Go (golang) 1.7.4.
It resolves a security vulnerability reported in Go (golang) version 1.7.3 which impacts all
users currently running on the macOS platform, powered by the Darwin operating system.

#### Cluster-specific bug fixes

- Fix hinted-handoff issue: Fix record size larger than max size.
  If a Hinted Handoff write appended a block that was larger than the maximum file size, the queue would get stuck because the maximum size was not updated. When reading the block back out during processing, the system would return an error because the block size was larger than the file size -- which indicates a corrupted block.

## v1.1.0 [2016-11-14]

This release builds off of the 1.1.0 release of InfluxDB OSS.
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.1/CHANGELOG.md#v110-2016-11-14) for more information about the OSS release.

### Upgrading

* The 1.1.0 release of OSS InfluxDB has some important [configuration changes](https://github.com/influxdata/influxdb/blob/1.1/CHANGELOG.md#configuration-changes) that may affect existing clusters.
* The `influxd-ctl join` command has been renamed to `influxd-ctl add-meta`. If you have existing scripts that use `influxd-ctl join`, they will need to use `influxd-ctl add-meta` or be updated to use the new cluster setup command.

#### Cluster setup

The `influxd-ctl join` command has been changed to simplify cluster setups. To join a node to a cluster, you can run `influxd-ctl join <meta:8091>`, and we will attempt to detect and add any meta or data node process running on the hosts automatically. The previous `join` command exists as `add-meta` now. If it's the first node of a cluster, the meta address argument is optional.
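
A minimal sketch of the new workflow (the hostname is hypothetical):

```bash
# First node of a brand-new cluster: the meta address argument is optional.
influxd-ctl join

# Any subsequent node: point at a running meta node; local meta and data
# processes on this host are detected and added automatically.
influxd-ctl join cluster-meta-01:8091
```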
|
||||
#### Logging
|
||||
|
||||
Switches to journald logging for on systemd systems. Logs are no longer sent to `/var/log/influxdb` on systemd systems.
|
||||
|
||||
#### Cluster-specific features
|
||||
|
||||
- Add a configuration option for setting gossiping frequency on data nodes
|
||||
- Allow for detailed insight into the Hinted Handoff queue size by adding `queueBytes` to the hh\_processor statistics
|
||||
- Add authentication to the meta service API
|
||||
- Update Go (golang) dependencies: Fix Go Vet and update circle Go Vet command
|
||||
- Simplify the process for joining nodes to a cluster
|
||||
- Include the node's version number in the `influxd-ctl show` output
|
||||
- Return an error if there are additional arguments after `influxd-ctl show`
|
||||
  Fixes any confusion between the correct command for showing detailed shard information (`influxd-ctl show-shards`) and the incorrect command (`influxd-ctl show shards`).
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
- Return an error if getting latest snapshot takes longer than 30 seconds
|
||||
- Remove any expired shards from the `/show-shards` output
|
||||
- Respect the [`pprof-enabled` configuration setting](/enterprise_influxdb/v1.3/administration/configuration/#pprof-enabled-true) and enable it by default on meta nodes
|
||||
- Respect the [`pprof-enabled` configuration setting](/enterprise_influxdb/v1.3/administration/configuration/#pprof-enabled-true-1) on data nodes
|
||||
- Use the data reference instead of `Clone()` during read-only operations for performance purposes
|
||||
- Prevent the system from double-collecting cluster statistics
|
||||
- Ensure that the Meta API redirects to the cluster leader when it gets the `ErrNotLeader` error
|
||||
- Don't overwrite cluster users with existing OSS InfluxDB users when migrating an OSS instance into a cluster
|
||||
- Fix a data race in the raft store
|
||||
- Allow large segment files (> 10MB) in the Hinted Handoff
|
||||
- Prevent `copy-shard` from retrying if the `copy-shard` command was killed
|
||||
- Prevent a hanging `influxd-ctl add-data` command by making data nodes check for meta nodes before they join a cluster
|
||||
|
||||
## v1.0.4 [2016-10-19]
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
- Respect the [Hinted Handoff settings](/enterprise_influxdb/v1.3/administration/configuration/#hinted-handoff) in the configuration file
|
||||
- Fix expanding regular expressions when all shards do not exist on node that's handling the request
|
||||
|
||||
## v1.0.3 [2016-10-07]
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
- Fix a panic in the Hinted Handoff: `lastModified`
|
||||
|
||||
## v1.0.2 [2016-10-06]
|
||||
|
||||
This release builds off of the 1.0.2 release of OSS InfluxDB. Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.0/CHANGELOG.md#v102-2016-10-05) for more information about the OSS release.
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
- Prevent double read-lock in the meta client
|
||||
- Fix a panic around a corrupt block in Hinted Handoff
|
||||
- Fix issue where `systemctl enable` would throw an error if the symlink already exists
|
||||
|
||||
## v1.0.1 [2016-09-28]
|
||||
|
||||
This release builds off of the 1.0.1 release of OSS InfluxDB.
|
||||
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.0/CHANGELOG.md#v101-2016-09-26)
|
||||
for more information about the OSS release.
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
* Balance shards correctly with a restore
|
||||
* Fix a panic in the Hinted Handoff: `runtime error: invalid memory address or nil pointer dereference`
|
||||
* Ensure meta node redirects to leader when removing data node
|
||||
* Fix a panic in the Hinted Handoff: `runtime error: makeslice: len out of range`
|
||||
* Update the data node configuration file so that only the minimum configuration options are uncommented
|
||||
|
||||
## v1.0.0 [2016-09-07]
|
||||
|
||||
This release builds off of the 1.0.0 release of OSS InfluxDB.
|
||||
Please see the OSS [release notes](https://github.com/influxdata/influxdb/blob/1.0/CHANGELOG.md#v100-2016-09-07) for more information about the OSS release.
|
||||
|
||||
Breaking Changes:
|
||||
|
||||
* The keywords `IF`, `EXISTS`, and `NOT` were removed for this release. This means you no longer need to specify `IF NOT EXISTS` for `CREATE DATABASE` or `IF EXISTS` for `DROP DATABASE`. Using these keywords will return a query error.
|
||||
* `max-series-per-database` was added with a default of 1M but can be disabled by setting it to `0`. Existing databases with series that exceed this limit will continue to load, but writes that would create new series will fail.
|
||||
|
||||
### Hinted handoff
|
||||
|
||||
A number of changes to hinted handoff are included in this release:
|
||||
|
||||
* Truncating only the corrupt block in a corrupted segment to minimize data loss
|
||||
* Immediately queue writes in hinted handoff if there are still writes pending to prevent inconsistencies in shards
|
||||
* Remove hinted handoff queues when data nodes are removed to eliminate manual cleanup tasks
|
||||
|
||||
### Performance
|
||||
|
||||
* `SHOW MEASUREMENTS` and `SHOW TAG VALUES` have been optimized to work better for multiple nodes and shards
|
||||
* `DROP` and `DELETE` statements run in parallel and more efficiently and should not leave the system in an inconsistent state
|
||||
|
||||
### Security
|
||||
|
||||
The Cluster API used by `influxd-ctl` can now be protected with SSL certs.
|
||||
|
||||
### Cluster management
|
||||
|
||||
Data nodes that can no longer be restarted can now be forcefully removed from the cluster using `influxd-ctl remove-data -force <addr>`. This should only be run if a graceful removal is not possible.
|
||||
|
||||
Backup and restore has been updated to fix issues and refine existing capabilities.
|
||||
|
||||
#### Cluster-specific features
|
||||
|
||||
- Add the Users method to control client
|
||||
- Add a `-force` option to the `influxd-ctl remove-data` command
|
||||
- Disable the logging of `stats` service queries
|
||||
- Optimize the `SHOW MEASUREMENTS` and `SHOW TAG VALUES` queries
|
||||
- Update the Go (golang) package library dependencies
|
||||
- Minimize the amount of data-loss in a corrupted Hinted Handoff file by truncating only the last corrupted segment instead of the entire file
|
||||
- Log a write error when the Hinted Handoff queue is full for a node
|
||||
- Remove Hinted Handoff queues on data nodes when the target data nodes are removed from the cluster
|
||||
- Add unit testing around restore in the meta store
|
||||
- Add full TLS support to the cluster API, including the use of self-signed certificates
|
||||
- Improve backup/restore to allow for partial restores to a different cluster or to a database with a different database name
|
||||
- Update the shard group creation logic to be balanced
|
||||
- Keep raft log to a minimum to prevent replaying large raft logs on startup
|
||||
|
||||
#### Cluster-specific bug fixes
|
||||
|
||||
- Remove bad connections from the meta executor connection pool
|
||||
- Fix a panic in the meta store
|
||||
- Fix a panic caused when a shard group is not found
|
||||
- Fix a corrupted Hinted Handoff
|
||||
- Ensure that any imported OSS admin users have all privileges in the cluster
|
||||
- Ensure that `max-select-series` is respected
|
||||
- Handle the `peer already known` error
|
||||
- Fix Hinted handoff panic around segment size check
|
||||
- Drop Hinted Handoff writes if they contain field type inconsistencies
|
||||
|
||||
<br>
|
||||
# Web Console
|
||||
|
||||
## DEPRECATED: Enterprise Web Console
|
||||
|
||||
The Enterprise Web Console has officially been deprecated and will be eliminated entirely by the end of 2017.
|
||||
No additional features will be added and no additional bug fix releases are planned.
|
||||
|
||||
For browser-based access to InfluxDB Enterprise, [Chronograf](/{{< latest "chronograf" >}}/introduction) is now the recommended tool to use.
|
|
@ -0,0 +1,47 @@
|
|||
---
|
||||
title: Third party software
|
||||
menu:
|
||||
enterprise_influxdb_1_9_ref:
|
||||
name: Third party software
|
||||
weight: 20
|
||||
parent: About the project
|
||||
---
|
||||
|
||||
InfluxData products contain third party software, which means the copyrighted,
|
||||
patented, or otherwise legally protected software of third parties that is
|
||||
incorporated in InfluxData products.
|
||||
|
||||
Third party suppliers make no representation nor warranty with respect to
|
||||
such third party software or any portion thereof.
|
||||
Third party suppliers assume no liability for any claim that might arise with
|
||||
respect to such third party software, nor for a
|
||||
customer’s use of or inability to use the third party software.
|
||||
|
||||
InfluxDB Enterprise 1.9 includes the following third-party software components, which are maintained on a version-by-version basis.
|
||||
|
||||
| Component | License | Integration |
|
||||
| :-------- | :-------- | :-------- |
|
||||
| [asn1-ber](https://github.com/go-asn1-ber/asn1-ber) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [cobra](https://github.com/spf13/cobra) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [context](https://github.com/gorilla/context)| [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
|
||||
| [flatbuffers](https://github.com/google/flatbuffers) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
|
||||
| [flux](https://github.com/influxdata/flux) | [Apache License 2.0](https://opensource.org/licenses/Apache-2.0) | Statically linked |
|
||||
| [goconvey](https://github.com/glycerine/goconvey) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [go-immutable-radix](https://github.com/hashicorp/go-immutable-radix) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [going](https://github.com/markbates/going) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [golang-lru](https://github.com/hashicorp/golang-lru) |[Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [go-msgpack](https://github.com/hashicorp/go-msgpack) | [BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause) | Statically linked |
|
||||
| [go-metrics](https://github.com/armon/go-metrics) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [go-uuid](https://github.com/hashicorp/go-uuid) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [handlers](https://github.com/gorilla/handlers) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [jose2go](https://github.com/dvsekhvalnov/jose2go) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [ldap](https://github.com/go-ldap/ldap) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [ldapserver](https://github.com/mark-rushakoff/ldapserver) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [mux](https://github.com/gorilla/mux) | [BSD 2-Clause](https://opensource.org/licenses/BSD-2-Clause) | Statically linked |
|
||||
| [pkcs7](https://github.com/fullsailor/pkcs7) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [pretty](https://github.com/kr/pretty) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [raft](https://github.com/hashicorp/raft) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [raft-boltdb](https://github.com/hashicorp/raft-boltdb) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
||||
| [sqlx](https://github.com/jmoiron/sqlx) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [text](https://github.com/kr/text) | [MIT](https://opensource.org/licenses/MIT) | Statically linked |
|
||||
| [yamux](https://github.com/hashicorp/yamux/) | [Mozilla Public License 2.0](https://opensource.org/licenses/MPL-2.0) | Statically linked |
|
|
@ -0,0 +1,10 @@
|
|||
---
|
||||
title: Administer InfluxDB Enterprise
|
||||
description: Configuration, security, and logging in InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Administration
|
||||
weight: 70
|
||||
---
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,348 @@
|
|||
---
|
||||
title: Use Anti-Entropy service in InfluxDB Enterprise
|
||||
description: The Anti-Entropy service monitors and repairs shards in InfluxDB.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/guides/Anti-Entropy/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Use Anti-entropy service
|
||||
weight: 60
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
{{% warn %}}
|
||||
Prior to InfluxDB Enterprise 1.7.2, the Anti-Entropy (AE) service was enabled by default. When shards create digests with many time ranges (tens of thousands), some customers have experienced significant performance issues, including CPU usage spikes. If your shards include a small number of time ranges (most have 1 to 10, some have up to several hundred) and you can benefit from the AE service, enable AE and monitor it closely to see if your performance is adversely impacted.
|
||||
{{% /warn %}}
|
||||
|
||||
## Introduction
|
||||
|
||||
Shard entropy refers to inconsistency among shards in a shard group.
|
||||
This can be due to the "eventually consistent" nature of data stored in InfluxDB
|
||||
Enterprise clusters or due to missing or unreachable shards.
|
||||
The Anti-Entropy (AE) service ensures that each data node has all the shards it
|
||||
owns according to the metastore and that all shards in a shard group are consistent.
|
||||
Missing shards are automatically repaired without operator intervention while
|
||||
out-of-sync shards can be manually queued for repair.
|
||||
This topic covers how the Anti-Entropy service works and some of the basic situations where it takes effect.
|
||||
|
||||
## Concepts
|
||||
|
||||
The Anti-Entropy service is a component of the `influxd` service available on each of your data nodes. Use this service to ensure that each data node has all of the shards that the metastore says it owns and ensure all shards in a shard group are in sync.
|
||||
If any shards are missing, the Anti-Entropy service will copy existing shards from other shard owners.
|
||||
If data inconsistencies are detected among shards in a shard group, [invoke the Anti-Entropy service](#command-line-tools-for-managing-entropy) and queue the out-of-sync shards for repair.
|
||||
In the repair process, the Anti-Entropy service will sync the necessary updates from other shards
|
||||
within a shard group.
|
||||
|
||||
By default, the service performs consistency checks every 5 minutes. This interval can be modified in the [`anti-entropy.check-interval`](/enterprise_influxdb/v1.9/administration/config-data-nodes/#check-interval-5m) configuration setting.
|
||||
|
||||
The Anti-Entropy service can only address missing or inconsistent shards when
|
||||
there is at least one copy of the shard available.
|
||||
In other words, as long as new and healthy nodes are introduced, a replication
|
||||
factor of 2 can recover from one missing or inconsistent node;
|
||||
a replication factor of 3 can recover from two missing or inconsistent nodes, and so on.
|
||||
With a replication factor of 1 (which is not recommended), the Anti-Entropy service cannot recover missing or inconsistent shards.
|
||||
|
||||
## Symptoms of entropy
|
||||
|
||||
The Anti-Entropy service automatically detects and fixes missing shards, but shard inconsistencies
|
||||
must be [manually detected and queued for repair](#detecting-and-repairing-entropy).
|
||||
There are symptoms of entropy that, if seen, would indicate an entropy repair is necessary.
|
||||
|
||||
### Different results for the same query
|
||||
|
||||
When running queries against an InfluxDB Enterprise cluster, each query may be routed to a different data node.
|
||||
If entropy affects data within the queried range, the same query will return different
|
||||
results depending on which node the query runs against.
|
||||
|
||||
_**Query attempt 1**_
|
||||
|
||||
```sql
|
||||
SELECT mean("usage_idle") FROM "cpu" WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
|
||||
|
||||
name: cpu
|
||||
time mean
|
||||
---- ----
|
||||
1528308000000000000 99.11867392974537
|
||||
1528308180000000000 99.15410822137049
|
||||
1528308360000000000 99.14927494363032
|
||||
1528308540000000000 99.1980535465783
|
||||
1528308720000000000 99.18584290492262
|
||||
```
|
||||
|
||||
_**Query attempt 2**_
|
||||
|
||||
```sql
|
||||
SELECT mean("usage_idle") FROM "cpu" WHERE time > '2018-06-06T18:00:00Z' AND time < '2018-06-06T18:15:00Z' GROUP BY time(3m) FILL(0)
|
||||
|
||||
name: cpu
|
||||
time mean
|
||||
---- ----
|
||||
1528308000000000000 99.11867392974537
|
||||
1528308180000000000 0
|
||||
1528308360000000000 0
|
||||
1528308540000000000 0
|
||||
1528308720000000000 99.18584290492262
|
||||
```
|
||||
|
||||
The results indicate that data is missing in the queried time range and entropy is present.
|
||||
|
||||
### Flapping dashboards
|
||||
|
||||
A "flapping" dashboard means data visualizations change when data is refreshed
|
||||
and pulled from a node with entropy (inconsistent data).
|
||||
It is the visual manifestation of getting [different results from the same query](#different-results-for-the-same-query).
|
||||
|
||||
<img src="/img/enterprise/1-6-flapping-dashboard.gif" alt="Flapping dashboard" style="width:100%; max-width:800px">
|
||||
|
||||
## Technical details
|
||||
|
||||
### Detecting entropy
|
||||
|
||||
The Anti-Entropy service runs on each data node and periodically checks its shards' statuses
|
||||
relative to the next data node in the ownership list.
|
||||
The service creates a "digest" or summary of data in the shards on the node.
|
||||
|
||||
For example, assume there are two data nodes in your cluster: `node1` and `node2`.
|
||||
Both `node1` and `node2` own `shard1` so `shard1` is replicated across each.
|
||||
|
||||
When a status check runs, `node1` will ask `node2` when `shard1` was last modified.
|
||||
If the reported modification time differs from the previous check, then
|
||||
`node1` asks `node2` for a new digest of `shard1` and checks for differences (performs a "diff") between `node2`'s `shard1` digest and the local `shard1` digest.
|
||||
If a difference exists, `shard1` is flagged as having entropy.
|
||||
|
||||
### Repairing entropy
|
||||
|
||||
If during a status check a node determines the next node is completely missing a shard,
|
||||
it immediately adds the missing shard to the repair queue.
|
||||
A background routine monitors the queue and begins the repair process as new shards are added to it.
|
||||
Repair requests are pulled from the queue by the background process and repaired using a `copy shard` operation.
|
||||
|
||||
> Currently, shards that are present on both nodes but contain different data are not automatically queued for repair.
|
||||
> A user must make the request via `influxd-ctl entropy repair <shard ID>`.
|
||||
> For more information, see [Detecting and repairing entropy](#detecting-and-repairing-entropy) below.
|
||||
|
||||
Using `node1` and `node2` from the [earlier example](#detecting-entropy), `node1` asks `node2` for a digest of `shard1`.
|
||||
`node1` diffs its own local `shard1` digest and `node2`'s `shard1` digest,
|
||||
then creates a new digest containing only the differences (the diff digest).
|
||||
The diff digest is used to create a patch containing only the data `node2` is missing.
|
||||
`node1` sends the patch to `node2` and instructs it to apply it.
|
||||
Once `node2` finishes applying the patch, it queues a repair for `shard1` locally.
|
||||
|
||||
The "node-to-node" shard repair continues until it runs on every data node that owns the shard in need of repair.
|
||||
|
||||
### Repair order
|
||||
|
||||
Repairs between shard owners happen in a deterministic order.
|
||||
This doesn't mean repairs always start on node 1 and then follow a specific node order.
|
||||
Repairs are viewed at the shard level.
|
||||
Each shard has a list of owners and the repairs for a particular shard will happen
|
||||
in a deterministic order among its owners.
|
||||
|
||||
When the Anti-Entropy service on any data node receives a repair request for a shard, it determines which
|
||||
owner node is the first in the deterministic order and forwards the request to that node.
|
||||
The request is now queued on the first owner.
|
||||
|
||||
The first owner's repair processor pulls it from the queue, detects the differences
|
||||
between the local copy of the shard with the copy of the same shard on the next
|
||||
owner in the deterministic order, then generates a patch from that difference.
|
||||
The first owner then makes an RPC call to the next owner instructing it to apply
|
||||
the patch to its copy of the shard.
|
||||
|
||||
Once the next owner has successfully applied the patch, it adds that shard to the Anti-Entropy repair queue.
|
||||
A list of "visited" nodes follows the repair through the list of owners.
|
||||
Each owner will check the list to detect when the repair has cycled through all owners,
|
||||
at which point the repair is finished.
|
||||
|
||||
### Hot shards
|
||||
|
||||
The Anti-Entropy service does its best to avoid hot shards (shards that are currently receiving writes)
|
||||
because they change quickly.
|
||||
While write replication between shard owner nodes (with a
|
||||
[replication factor](/enterprise_influxdb/v1.9/concepts/glossary/#replication-factor)
|
||||
greater than 1) typically happens in milliseconds, this slight difference is
|
||||
still enough to cause the appearance of entropy where there is none.
|
||||
|
||||
Because the Anti-Entropy service repairs only cold shards, unexpected effects can occur.
|
||||
Consider the following scenario:
|
||||
|
||||
1. A shard goes cold.
|
||||
2. Anti-Entropy detects entropy.
|
||||
3. Entropy is reported by the [Anti-Entropy `/status` API](/enterprise_influxdb/v1.9/administration/anti-entropy-api/#get-status) or with the `influxd-ctl entropy show` command.
|
||||
4. Shard takes a write, gets compacted, or something else causes it to go hot.
|
||||
_These actions are out of Anti-Entropy's control._
|
||||
5. A repair is requested, but is ignored because the shard is now hot.
|
||||
|
||||
In this example, you would have to periodically request a repair of the shard
|
||||
until it either shows as being in the queue, being repaired, or no longer in the list of shards with entropy.
|
||||
|
||||
## Configuration
|
||||
|
||||
The configuration settings for the Anti-Entropy service are described in [Anti-Entropy settings](/enterprise_influxdb/v1.9/administration/config-data-nodes#anti-entropy) section of the data node configuration.
|
||||
|
||||
To enable the Anti-Entropy service, change the default value of the `[anti-entropy].enabled = false` setting to `true` in the `influxdb.conf` file on each of your data nodes.
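
For example, the relevant section of `influxdb.conf` might look like the following minimal sketch (the check interval shown simply restates the default described above):

```toml
[anti-entropy]
  # Enable the Anti-Entropy service (disabled by default)
  enabled = true
  # How often the service checks shards for inconsistencies
  check-interval = "5m"
```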
|
||||
|
||||
## Command line tools for managing entropy
|
||||
|
||||
>**Note:** The Anti-Entropy service is disabled by default and must be enabled before using these commands.
|
||||
|
||||
The `influxd-ctl entropy` command enables you to manage entropy among shards in a cluster.
|
||||
It includes the following subcommands:
|
||||
|
||||
#### `show`
|
||||
|
||||
Lists shards that are in an inconsistent state and in need of repair as well as
|
||||
shards currently in the repair queue.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
```
|
||||
|
||||
#### `repair`
|
||||
|
||||
Queues a shard for repair.
|
||||
It requires a Shard ID which is provided in the [`show`](#show) output.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy repair <shardID>
|
||||
```
|
||||
|
||||
Repairing entropy in a shard is an asynchronous operation.
|
||||
This command will return quickly as it only adds a shard to the repair queue.
|
||||
Queuing shards for repair is idempotent.
|
||||
There is no harm in making multiple requests to repair the same shard even if
|
||||
it is already queued, currently being repaired, or not in need of repair.
|
||||
|
||||
#### `kill-repair`
|
||||
|
||||
Removes a shard from the repair queue.
|
||||
It requires a Shard ID which is provided in the [`show`](#show) output.
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy kill-repair <shardID>
|
||||
```
|
||||
|
||||
This only applies to shards in the repair queue.
|
||||
It does not cancel repairs on nodes that are in the process of being repaired.
|
||||
Once a repair has started, requests to cancel it are ignored.
|
||||
|
||||
> Stopping an entropy repair operation for a **missing** shard is not currently supported.
|
||||
> It may be possible to stop repairs for missing shards with the
|
||||
> [`influxd-ctl kill-copy-shard`](/enterprise_influxdb/v1.9/tools/influxd-ctl/#kill-copy-shard) command.
|
||||
|
||||
## InfluxDB Anti-Entropy API
|
||||
|
||||
The Anti-Entropy service uses an API for managing and monitoring entropy.
|
||||
Details on the available API endpoints can be found in [The InfluxDB Anti-Entropy API](/enterprise_influxdb/v1.9/administration/anti-entropy-api).
|
||||
|
||||
## Use cases
|
||||
|
||||
Common use cases for the Anti-Entropy service include detecting and repairing entropy, replacing unresponsive data nodes, replacing data nodes for upgrades and maintenance, and eliminating entropy in active shards.
|
||||
|
||||
### Detecting and repairing entropy
|
||||
|
||||
Periodically, you may want to see if shards in your cluster have entropy or are
|
||||
inconsistent with other shards in the shard group.
|
||||
Use the `influxd-ctl entropy show` command to list all shards with detected entropy:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
|
||||
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
|
||||
```
|
||||
|
||||
Then use the `influxd-ctl entropy repair` command to add the shards with entropy
|
||||
to the repair queue:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy repair 21179
|
||||
|
||||
Repair Shard 21179 queued
|
||||
|
||||
influxd-ctl entropy repair 25165
|
||||
|
||||
Repair Shard 25165 queued
|
||||
```
|
||||
|
||||
Check on the status of the repair queue with the `influxd-ctl entropy show` command:
|
||||
|
||||
```bash
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2017-10-09 00:00:00 +0000 UTC 2017-10-16 00:00:00 +0000 UTC 2018-10-22 00:00:00 +0000 UTC diff
|
||||
25165 statsdb 1hour 2017-11-20 00:00:00 +0000 UTC 2017-11-27 00:00:00 +0000 UTC 2018-12-03 00:00:00 +0000 UTC diff
|
||||
|
||||
Queued Shards: [21179 25165]
|
||||
```
|
||||
|
||||
### Replacing an unresponsive data node
|
||||
|
||||
If a data node suddenly disappears due to a catastrophic hardware failure or for any other reason, as soon as a new data node is online, the Anti-Entropy service will copy the correct shards to the new replacement node. The time it takes for the copying to complete is determined by the number of shards to be copied and how much data is stored in each.
|
||||
|
||||
_View the [Replacing Data Nodes](/enterprise_influxdb/v1.9/guides/replacing-nodes/#replace-data-nodes-in-an-influxdb-enterprise-cluster) documentation for instructions on replacing data nodes in your InfluxDB Enterprise cluster._
|
||||
|
||||
### Replacing a machine that is running a data node
|
||||
|
||||
Perhaps you are replacing a machine that is being decommissioned, upgrading hardware, or something else entirely.
|
||||
The Anti-Entropy service will automatically copy shards to the new machines.
|
||||
|
||||
Once you have successfully run the `influxd-ctl update-data` command, you are free
|
||||
to shut down the retired node without causing any interruption to the cluster.
|
||||
The Anti-Entropy process will continue copying the appropriate shards from the
|
||||
remaining replicas in the cluster.
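
For reference, a minimal sketch of that command, assuming placeholder host names for the retired and replacement data nodes:

```bash
# Point the cluster at the new machine; Anti-Entropy then copies the
# appropriate shards to it from the remaining replicas
influxd-ctl update-data retired-data-node:8088 new-data-node:8088
```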
|
||||
|
||||
### Fixing entropy in active shards
|
||||
|
||||
In rare cases, the currently active shard, or the shard to which new data is
|
||||
currently being written, may find itself with inconsistent data.
|
||||
Because the Anti-Entropy process can't write to hot shards, you must stop writes to the new
|
||||
shard using the [`influxd-ctl truncate-shards` command](/enterprise_influxdb/v1.9/tools/influxd-ctl/#truncate-shards),
|
||||
then add the inconsistent shard to the entropy repair queue:
|
||||
|
||||
```bash
|
||||
# Truncate hot shards
|
||||
influxd-ctl truncate-shards
|
||||
|
||||
# Show shards with entropy
|
||||
influxd-ctl entropy show
|
||||
|
||||
Entropy
|
||||
==========
|
||||
ID Database Retention Policy Start End Expires Status
|
||||
21179 statsdb 1hour 2018-06-06 12:00:00 +0000 UTC 2018-06-06 23:44:12 +0000 UTC 2018-12-06 00:00:00 +0000 UTC diff
|
||||
|
||||
# Add the inconsistent shard to the repair queue
|
||||
influxd-ctl entropy repair 21179
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Queued repairs are not being processed
|
||||
|
||||
The primary reason a repair in the repair queue isn't being processed is because
|
||||
it went "hot" after the repair was queued.
|
||||
The Anti-Entropy service only repairs cold shards or shards that are not currently being written to.
|
||||
If the shard is hot, the Anti-Entropy service will wait until it goes cold again before performing the repair.
|
||||
|
||||
If the shard is "old" and writes to it are part of a backfill process, you simply
|
||||
have to wait until the backfill process is finished. If the shard is the active
|
||||
shard, run `truncate-shards` to stop writes to active shards. This process is
|
||||
outlined [above](#fixing-entropy-in-active-shards).
|
||||
|
||||
### Anti-Entropy log messages
|
||||
|
||||
Below are common messages output by Anti-Entropy along with what they mean.
|
||||
|
||||
#### `Checking status`
|
||||
|
||||
Indicates that the Anti-Entropy process has begun the [status check process](#detecting-entropy).
|
||||
|
||||
#### `Skipped shards`
|
||||
|
||||
Indicates that the Anti-Entropy process has skipped a status check on shards because they are currently [hot](#hot-shards).
|
|
@ -0,0 +1,238 @@
|
|||
---
|
||||
title: InfluxDB Anti-Entropy API
|
||||
description: >
|
||||
  Monitor and repair shards on InfluxDB Enterprise data nodes using the InfluxDB Anti-Entropy API.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Anti-entropy API
|
||||
weight: 70
|
||||
parent: Use Anti-entropy service
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/administration/anti-entropy-api/
|
||||
---
|
||||
|
||||
>**Note:** The Anti-Entropy API is available from the meta nodes and is only available when the Anti-Entropy service is enabled in the data node configuration settings. For information on the configuration settings, see
|
||||
> [Anti-Entropy settings](/enterprise_influxdb/v1.9/administration/config-data-nodes/#anti-entropy-ae-settings).
|
||||
|
||||
Use the [Anti-Entropy service](/enterprise_influxdb/v1.9/administration/anti-entropy) in InfluxDB Enterprise to monitor and repair entropy in data nodes and their shards. To access the Anti-Entropy API and work with this service, use [`influxd-ctl entropy`](/enterprise_influxdb/v1.9/tools/influxd-ctl/#entropy) (also available on meta nodes).
|
||||
|
||||
The base URL is:
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair
|
||||
```
|
||||
|
||||
## GET `/status`
|
||||
|
||||
### Description
|
||||
|
||||
Lists shards that are in an inconsistent state and in need of repair.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `local` | query | Limits status check to local shards on the data node handling this request | No | boolean |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
|-------------|--------------------|
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description | Type |
|
||||
| ---- | ----------- | ------ |
|
||||
| `200` | `Successful operation` | object |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X GET "http://localhost:8086/shard-repair/status?local=true" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/status?local=true
|
||||
|
||||
```
|
||||
|
||||
### Responses
|
||||
|
||||
Example of server response value:
|
||||
|
||||
```json
|
||||
{
|
||||
"shards": [
|
||||
{
|
||||
"id": "1",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "-259200000000000",
|
||||
"end_time": "345600000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
},
|
||||
{
|
||||
"id": "3",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "62640000000000000",
|
||||
"end_time": "63244800000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
}
|
||||
],
|
||||
"queued_shards": [
|
||||
"3",
|
||||
"5",
|
||||
"9"
|
||||
],
|
||||
"processing_shards": [
|
||||
"3",
|
||||
"9"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## POST `/repair`
|
||||
|
||||
### Description
|
||||
|
||||
Queues the specified shard for repair of the inconsistent state.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `id` | query | ID of shard to queue for repair | Yes | integer |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
| ----------- | ----- |
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description |
|
||||
| ---- | ----------- |
|
||||
| `204` | `Successful operation` |
|
||||
| `400` | `Bad request` |
|
||||
| `500` | `Internal server error` |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X POST "http://localhost:8086/shard-repair/repair?id=1" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/repair?id=1
|
||||
```
|
||||
|
||||
## POST `/cancel-repair`
|
||||
|
||||
### Description
|
||||
|
||||
Removes the specified shard from the repair queue on nodes.
|
||||
|
||||
### Parameters
|
||||
|
||||
| Name | Located in | Description | Required | Type |
|
||||
| ---- | ---------- | ----------- | -------- | ---- |
|
||||
| `id` | query | ID of shard to remove from repair queue | Yes | integer |
|
||||
| `local` | query | Only remove shard from repair queue on node receiving the request | No | boolean |
|
||||
|
||||
### Responses
|
||||
|
||||
#### Headers
|
||||
|
||||
| Header name | Value |
|
||||
|-------------|--------------------|
|
||||
| `Accept` | `application/json` |
|
||||
|
||||
#### Status codes
|
||||
|
||||
| Code | Description |
|
||||
| ---- | ----------- |
|
||||
| `204` | `Successful operation` |
|
||||
| `400` | `Bad request` |
|
||||
| `500` | `Internal server error` |
|
||||
|
||||
### Examples
|
||||
|
||||
#### cURL request
|
||||
|
||||
```bash
|
||||
curl -X POST "http://localhost:8086/shard-repair/cancel-repair?id=1&local=false" -H "accept: application/json"
|
||||
```
|
||||
|
||||
#### Request URL
|
||||
|
||||
```text
|
||||
http://localhost:8086/shard-repair/cancel-repair?id=1&local=false
|
||||
```
|
||||
|
||||
## Models
|
||||
|
||||
### ShardStatus
|
||||
|
||||
| Name | Type | Required |
|
||||
| ---- | ---- | -------- |
|
||||
| `id` | string | No |
|
||||
| `database` | string | No |
|
||||
| `retention_policy` | string | No |
|
||||
| `start_time` | string | No |
|
||||
| `end_time` | string | No |
|
||||
| `expires` | string | No |
|
||||
| `status` | string | No |
|
||||
|
||||
### Examples
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"shards": [
|
||||
{
|
||||
"id": "1",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "-259200000000000",
|
||||
"end_time": "345600000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
},
|
||||
{
|
||||
"id": "3",
|
||||
"database": "ae",
|
||||
"retention_policy": "autogen",
|
||||
"start_time": "62640000000000000",
|
||||
"end_time": "63244800000000000",
|
||||
"expires": "0",
|
||||
"status": "diff"
|
||||
}
|
||||
],
|
||||
"queued_shards": [
|
||||
"3",
|
||||
"5",
|
||||
"9"
|
||||
],
|
||||
"processing_shards": [
|
||||
"3",
|
||||
"9"
|
||||
]
|
||||
}
|
||||
```
|
|
@ -0,0 +1,467 @@
|
|||
---
|
||||
title: Back up and restore InfluxDB Enterprise clusters
|
||||
description: >
|
||||
  Back up and restore InfluxDB Enterprise clusters in case of unexpected data loss.
|
||||
aliases:
|
||||
- /enterprise/v1.8/guides/backup-and-restore/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Back up and restore
|
||||
weight: 80
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
When deploying InfluxDB Enterprise in production environments, you should have a strategy and procedures for backing up and restoring your InfluxDB Enterprise clusters to be prepared for unexpected data loss.
|
||||
|
||||
The tools provided by InfluxDB Enterprise can be used to:
|
||||
|
||||
- Provide disaster recovery due to unexpected events
|
||||
- Migrate data to new environments or servers
|
||||
- Restore clusters to a consistent state
|
||||
- Debug issues
|
||||
|
||||
Depending on the volume of data to be protected and your application requirements, InfluxDB Enterprise offers two methods, described below, for managing backups and restoring data:
|
||||
|
||||
- [Backup and restore utilities](#backup-and-restore-utilities) — For most applications
|
||||
- [Exporting and importing data](#exporting-and-importing-data) — For large datasets
|
||||
|
||||
> **Note:** Use the [`backup` and `restore` utilities (InfluxDB OSS 1.5 and later)](/{{< latest "influxdb" "v1" >}}/administration/backup_and_restore/) to:
|
||||
>
|
||||
> - Restore InfluxDB Enterprise backup files to InfluxDB OSS instances.
|
||||
> - Back up InfluxDB OSS data that can be restored in InfluxDB Enterprise clusters.
|
||||
|
||||
## Backup and restore utilities
|
||||
|
||||
InfluxDB Enterprise supports backing up and restoring data in a cluster, a single database, a single database and retention policy, and single shards. Most InfluxDB Enterprise applications can use the backup and restore utilities.
|
||||
|
||||
Use the `backup` and `restore` utilities to back up and restore between `influxd` instances with the same versions or with only minor version differences. For example, you can back up from 1.7.3 and restore on 1.8.2.
|
||||
|
||||
### Backup utility
|
||||
|
||||
A backup creates a copy of the [metastore](/enterprise_influxdb/v1.9/concepts/glossary/#metastore) and [shard](/enterprise_influxdb/v1.9/concepts/glossary/#shard) data at that point in time and stores the copy in the specified directory.
|
||||
|
||||
Or, back up **only the cluster metastore** using the `-strategy only-meta` backup option. For more information, see [perform a metastore only backup](#perform-a-metastore-only-backup).
|
||||
|
||||
All backups include a manifest, a JSON file describing what was collected during the backup.
|
||||
The filenames reflect the UTC timestamp of when the backup was created, for example:
|
||||
|
||||
- Metastore backup: `20060102T150405Z.meta` (includes usernames and passwords)
|
||||
- Shard data backup: `20060102T150405Z.<shard_id>.tar.gz`
|
||||
- Manifest: `20060102T150405Z.manifest`
|
||||
|
||||
Backups can be full, metastore only, or incremental, and they are incremental by default:
|
||||
|
||||
- **Full backup**: Creates a copy of the metastore and shard data.
|
||||
- **Incremental backup**: Creates a copy of the metastore and shard data that have changed since the last incremental backup. If there are no existing incremental backups, the system automatically performs a complete backup.
|
||||
- **Metastore only backup**: Creates a copy of the metastore data only.
|
||||
|
||||
Restoring different types of backups requires different syntax.
|
||||
To prevent issues with [restore](#restore-utility), keep full backups, metastore only backups, and incremental backups in separate directories.
|
||||
|
||||
>**Note:** The backup utility copies all data through the meta node that is used to
|
||||
execute the backup. As a result, performance of a backup and restore is typically limited by the network IO of the meta node. Increasing the resources available to this meta node (such as resizing the EC2 instance) can significantly improve backup and restore performance.
|
||||
|
||||
#### Syntax
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] backup [backup-options] <path-to-backup-directory>
|
||||
```
|
||||
|
||||
> **Note:** The `influxd-ctl backup` command exits with `0` for success and `1` for failure. If the backup fails, output can be directed to a log file to troubleshoot.
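
For example, a backup script might capture the output and check the exit status (the paths and file names below are illustrative):

```bash
# Run an incremental backup and keep the output for troubleshooting
if ! influxd-ctl backup ./backups > backup.log 2>&1; then
  echo "Backup failed; see backup.log" >&2
fi
```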
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.9/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Backup options
|
||||
|
||||
- `-db <string>`: name of the single database to back up
|
||||
- `-from <TCP-address>`: the data node TCP address to prefer when backing up
|
||||
- `-strategy`: select the backup strategy to apply during backup
|
||||
- `incremental`: _**(Default)**_ backup only data added since the previous backup.
|
||||
- `full`: perform a full backup. Same as `-full`.
|
||||
- `only-meta`: perform a backup of metadata only: users, roles,
|
||||
databases, continuous queries, retention policies. Shards are not exported.
|
||||
- `-full`: perform a full backup. Deprecated in favor of `-strategy=full`.
|
||||
- `-rp <string>`: the name of the single retention policy to back up (must specify `-db` with `-rp`)
|
||||
- `-shard <unit>`: the ID of the single shard to back up
|
||||
|
||||
### Backup examples
|
||||
|
||||
Store the following incremental backups in different directories.
|
||||
The first backup specifies `-db myfirstdb` and the second backup specifies
|
||||
different options: `-db myfirstdb` and `-rp autogen`.
|
||||
|
||||
```bash
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
|
||||
influxd-ctl backup -db myfirstdb -rp autogen ./myfirstdb-autogen-backup
|
||||
```
|
||||
|
||||
Store the following incremental backups in the same directory.
|
||||
Both backups specify the same `-db` flag and the same database.
|
||||
|
||||
```bash
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
|
||||
influxd-ctl backup -db myfirstdb ./myfirstdb-allrp-backup
|
||||
```
|
||||
|
||||
#### Perform an incremental backup
|
||||
|
||||
Perform an incremental backup into the current directory with the command below.
|
||||
If there are any existing backups in the current directory, the system performs an incremental backup.
|
||||
If there aren't any existing backups in the current directory, the system performs a backup of all data in InfluxDB.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup .
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup .
|
||||
Backing up meta data... Done. 421 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 903.539567ms, 307712 bytes transferred
|
||||
Backing up node bf5a5f73bad8:8088, db _internal, rp monitor, shard 1... Done. Backed up in 138.694402ms, 53760 bytes transferred
|
||||
Backing up node 9bf0fa0c302a:8088, db _internal, rp monitor, shard 2... Done. Backed up in 101.791148ms, 40448 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db _internal, rp monitor, shard 3... Done. Backed up in 144.477159ms, 39424 bytes transferred
|
||||
Backed up to . in 1.293710883s, transferred 441765 bytes
|
||||
$ ls
|
||||
20160803T222310Z.manifest 20160803T222310Z.s1.tar.gz 20160803T222310Z.s3.tar.gz
|
||||
20160803T222310Z.meta 20160803T222310Z.s2.tar.gz 20160803T222310Z.s4.tar.gz
|
||||
```
|
||||
|
||||
#### Perform a full backup
|
||||
|
||||
Perform a full backup into a specific directory with the command below.
|
||||
The directory must already exist.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup -full <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup -full backup_dir
|
||||
Backing up meta data... Done. 481 bytes transferred
|
||||
Backing up node <hostname>:8088, db _internal, rp monitor, shard 1... Done. Backed up in 33.207375ms, 238080 bytes transferred
|
||||
Backing up node <hostname>:8088, db telegraf, rp autogen, shard 2... Done. Backed up in 15.184391ms, 95232 bytes transferred
|
||||
Backed up to backup_dir in 51.388233ms, transferred 333793 bytes
|
||||
$ ls backup_dir
|
||||
20170130T184058Z.manifest
|
||||
20170130T184058Z.meta
|
||||
20170130T184058Z.s1.tar.gz
|
||||
20170130T184058Z.s2.tar.gz
|
||||
```
|
||||
|
||||
#### Perform an incremental backup on a single database
|
||||
|
||||
Point at a remote meta server and back up only one database into a given directory (the directory must already exist):
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl -bind <metahost>:8091 backup -db <db-name> <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl -bind 2a1b7a338184:8091 backup -db telegraf ./telegrafbackup
|
||||
Backing up meta data... Done. 318 bytes transferred
|
||||
Backing up node 7ba671c7644b:8088, db telegraf, rp autogen, shard 4... Done. Backed up in 997.168449ms, 399872 bytes transferred
|
||||
Backed up to ./telegrafbackup in 1.002358077s, transferred 400190 bytes
|
||||
$ ls ./telegrafbackup
|
||||
20160803T222811Z.manifest 20160803T222811Z.meta 20160803T222811Z.s4.tar.gz
|
||||
```
|
||||
|
||||
#### Perform a metastore only backup
|
||||
|
||||
Perform a meta store only backup into a specific directory with the command below.
|
||||
The directory must already exist.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl backup -strategy only-meta <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl backup -strategy only-meta backup_dir
|
||||
Backing up meta data... Done. 481 bytes transferred
|
||||
Backed up to backup_dir in 51.388233ms, transferred 481 bytes
|
||||
$ ls backup_dir
|
||||
20170130T184058Z.manifest
|
||||
20170130T184058Z.meta
|
||||
```
|
||||
|
||||
### Restore utility
|
||||
|
||||
#### Disable anti-entropy (AE) before restoring a backup
|
||||
|
||||
> Before restoring a backup, stop the anti-entropy (AE) service (if enabled) on **each data node in the cluster, one at a time**.
|
||||
|
||||
>
|
||||
> 1. Stop the `influxd` service.
|
||||
> 2. Set `[anti-entropy].enabled` to `false` in the data node configuration file (`influxdb.conf` by default).
|
||||
> 3. Restart the `influxd` service and wait for the data node to receive read and write requests and for the [hinted handoff queue](/enterprise_influxdb/v1.9/concepts/clustering/#hinted-handoff) to drain.
|
||||
> 4. Once AE is disabled on all data nodes and each node returns to a healthy state, you're ready to restore the backup. For details on how to restore your backup, see examples below.
|
||||
> 5. After restoring the backup, re-enable AE and restart the `influxd` service on each data node.
|
||||
|
||||
##### Restore a backup
|
||||
|
||||
Restore a backup to an existing cluster or a new cluster.
|
||||
By default, a restore writes to databases using the backed-up data's [replication factor](/enterprise_influxdb/v1.9/concepts/glossary/#replication-factor).
|
||||
An alternate replication factor can be specified with the `-newrf` flag when restoring a single database.
|
||||
Restore supports both `-full` backups and incremental backups; the syntax for
|
||||
a restore differs depending on the backup type.
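
For example, a single database could be restored under a new name with a different replication factor (the database name, directory, and replication factor below are placeholders):

```bash
influxd-ctl restore -db telegraf -newdb restored_telegraf -newrf 2 my-incremental-backup/
```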
|
||||
|
||||
##### Restores from an existing cluster to a new cluster
|
||||
|
||||
Restores from an existing cluster to a new cluster restore the existing cluster's
|
||||
[users](/enterprise_influxdb/v1.9/concepts/glossary/#user), roles,
|
||||
[databases](/enterprise_influxdb/v1.9/concepts/glossary/#database), and
|
||||
[continuous queries](/enterprise_influxdb/v1.9/concepts/glossary/#continuous-query-cq) to
|
||||
the new cluster.
|
||||
|
||||
They do not restore Kapacitor [subscriptions](/enterprise_influxdb/v1.9/concepts/glossary/#subscription).
|
||||
In addition, restores to a new cluster drop any data in the new cluster's
|
||||
`_internal` database and begin writing to that database anew.
|
||||
The restore does not write the existing cluster's `_internal` database to
|
||||
the new cluster.
|
||||
|
||||
#### Syntax to restore from incremental and metadata backups
|
||||
|
||||
Use the syntax below to restore an incremental or metadata backup to a new cluster or an existing cluster.
|
||||
**The existing cluster must contain no data in the affected databases.**
|
||||
Performing a restore from an incremental backup requires the path to the incremental backup's directory.
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] restore [restore-options] <path-to-backup-directory>
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The existing cluster can have data in the `_internal` database (the database InfluxDB creates if
|
||||
[internal monitoring](/platform/monitoring/influxdata-platform/tools/measurements-internal) is enabled).
|
||||
The system automatically drops the `_internal` database when it performs a complete restore.
|
||||
{{% /note %}}
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.9/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Restore options
|
||||
|
||||
- `-db <string>`: the name of the single database to restore
|
||||
- `-list`: shows the contents of the backup
|
||||
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
|
||||
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
|
||||
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
|
||||
- `-rp <string>`: the name of the single retention policy to restore
|
||||
- `-shard <unit>`: the shard ID to restore
|
||||
|
||||
#### Syntax to restore from a full or manifest only backup
|
||||
|
||||
Use the syntax below to restore a full or manifest only backup to a new cluster or an existing cluster.
|
||||
Note that the existing cluster must contain no data in the affected databases.*
|
||||
Performing a restore requires the `-full` flag and the path to the backup's manifest file.
|
||||
|
||||
```bash
|
||||
influxd-ctl [global-options] restore [options] -full <path-to-manifest-file>
|
||||
```
|
||||
|
||||
\* The existing cluster can have data in the `_internal` database, the database
|
||||
that the system creates by default.
|
||||
The system automatically drops the `_internal` database when it performs a
|
||||
complete restore.
|
||||
|
||||
##### Global options
|
||||
|
||||
See the [`influxd-ctl` documentation](/enterprise_influxdb/v1.9/tools/influxd-ctl/#global-options)
|
||||
for a complete list of the global `influxd-ctl` options.
|
||||
|
||||
##### Restore options
|
||||
|
||||
- `-db <string>`: the name of the single database to restore
|
||||
- `-list`: shows the contents of the backup
|
||||
- `-newdb <string>`: the name of the new database to restore to (must specify with `-db`)
|
||||
- `-newrf <int>`: the new replication factor to restore to (this is capped to the number of data nodes in the cluster)
|
||||
- `-newrp <string>`: the name of the new retention policy to restore to (must specify with `-rp`)
|
||||
- `-rp <string>`: the name of the single retention policy to restore
|
||||
- `-shard <unit>`: the shard ID to restore
|
||||
|
||||
#### Examples
|
||||
|
||||
##### Restore from an incremental backup
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 2...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 61.046571ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 83.892591ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
##### Restore from a metadata backup
|
||||
|
||||
In this example, the `restore` command restores a metadata backup stored
|
||||
in the `metadata-backup/` directory.
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore metadata-backup/
|
||||
Using backup directory: metadata-backup/
|
||||
Using meta backup: 20200101T000000Z.meta
|
||||
Restoring meta data... Done. Restored in 21.373019ms, 1 shards mapped
|
||||
Restored from metadata-backup/ in 19.2311ms, transferred 588 bytes
|
||||
```
|
||||
|
||||
##### Restore from a `-full` backup
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore -full <path-to-manifest-file>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore -full my-full-backup/20170131T020341Z.manifest
|
||||
Using manifest: my-full-backup/20170131T020341Z.manifest
|
||||
Restoring meta data... Done. Restored in 9.585639ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 2...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 2 in 48.095082ms, 569344 bytes transferred
|
||||
Restored from my-full-backup in 58.58301ms, transferred 569344 bytes
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Restoring from a full backup **does not** restore metadata.
|
||||
To restore metadata, [restore a metadata backup](#restore-from-a-metadata-backup) separately.
|
||||
{{% /note %}}
|
||||
|
||||
##### Restore from an incremental backup for a single database and give the database a new name
|
||||
|
||||
```bash
|
||||
# Syntax
|
||||
influxd-ctl restore -db <src> -newdb <dest> <path-to-backup-directory>
|
||||
|
||||
# Example
|
||||
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 8.119655ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 4...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 4 in 57.89687ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 66.715524ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
##### Restore from an incremental backup for a database and merge that database into an existing database
|
||||
|
||||
Your `telegraf` database was mistakenly dropped, but you have a recent backup so you've only lost a small amount of data.
|
||||
|
||||
If Telegraf is still running, it will recreate the `telegraf` database shortly after the database is dropped.
|
||||
You might try to directly restore your `telegraf` backup just to find that you can't restore:
|
||||
|
||||
```bash
|
||||
$ influxd-ctl restore -db telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Error.
|
||||
restore: operation exited with error: problem setting snapshot: database already exists
|
||||
```
|
||||
|
||||
To work around this, you can restore your telegraf backup into a new database by specifying the `-db` flag for the source and the `-newdb` flag for the new destination:
|
||||
|
||||
```bash
|
||||
$ influxd-ctl restore -db telegraf -newdb restored_telegraf my-incremental-backup/
|
||||
Using backup directory: my-incremental-backup/
|
||||
Using meta backup: 20170130T231333Z.meta
|
||||
Restoring meta data... Done. Restored in 19.915242ms, 1 shards mapped
|
||||
Restoring db telegraf, rp autogen, shard 2 to shard 7...
|
||||
Copying data to <hostname>:8088... Copying data to <hostname>:8088... Done. Restored shard 2 into shard 7 in 36.417682ms, 588800 bytes transferred
|
||||
Restored from my-incremental-backup/ in 56.623615ms, transferred 588800 bytes
|
||||
```
|
||||
|
||||
Then, in the [`influx` client](/enterprise_influxdb/v1.9/tools/use-influx/), use an [`INTO` query](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause) to copy the data from the new database into the existing `telegraf` database:
|
||||
|
||||
```bash
|
||||
$ influx
|
||||
> USE restored_telegraf
|
||||
Using database restored_telegraf
|
||||
> SELECT * INTO telegraf..:MEASUREMENT FROM /.*/ GROUP BY *
|
||||
name: result
|
||||
------------
|
||||
time written
|
||||
1970-01-01T00:00:00Z 471
|
||||
```
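After verifying that the data was copied into `telegraf`, you can optionally drop the temporary database:

```bash
> DROP DATABASE restored_telegraf
```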
|
||||
|
||||
#### Common issues with restore
|
||||
|
||||
##### Restore writes information not part of the original backup
|
||||
|
||||
If a [restore from an incremental backup](#syntax-to-restore-from-incremental-and-metadata-backups)
|
||||
does not limit the restore to the same database, retention policy, and shard specified by the backup command,
|
||||
the restore may appear to restore information that was not part of the original backup.
|
||||
Backups consist of a shard data backup and a metastore backup.
|
||||
The **shard data backup** contains the actual time series data: the measurements, tags, fields, and so on.
|
||||
The **metastore backup** contains user information, database names, retention policy names, shard metadata, continuous queries, and subscriptions.
|
||||
|
||||
When the system creates a backup, the backup includes:
|
||||
|
||||
* the relevant shard data determined by the specified backup options
|
||||
* all of the metastore information in the cluster regardless of the specified backup options
|
||||
|
||||
Because a backup always includes the complete metastore information, a restore that doesn't include the same options specified by the backup command may appear to restore data that were not targeted by the original backup.
|
||||
The unintended data, however, include only the metastore information, not the shard data associated with that metastore information.
|
||||
|
||||
##### Restore a backup created prior to version 1.2.0
|
||||
|
||||
InfluxDB Enterprise introduced incremental backups in version 1.2.0.
|
||||
To restore a backup created prior to version 1.2.0, be sure to follow the syntax
|
||||
for [restoring from a full backup](#restore-from-a-full-backup).
|
||||
|
||||
## Exporting and importing data
|
||||
|
||||
For most InfluxDB Enterprise applications, the [backup and restore utilities](#backup-and-restore-utilities) provide the tools you need for your backup and restore strategy. However, in some cases, the standard backup and restore utilities may not adequately handle the volumes of data in your application.
|
||||
|
||||
As an alternative to the standard backup and restore utilities, use the InfluxDB `influx_inspect export` and `influx -import` commands to create backup and restore procedures for your disaster recovery and backup strategy. These commands can be executed manually or included in shell scripts that run the export and import operations at scheduled intervals (example below).
|
||||
|
||||
### Exporting data
|
||||
|
||||
Use the [`influx_inspect export` command](/{{< latest "influxdb" "v1" >}}/tools/influx_inspect#export) to export data in line protocol format from your InfluxDB Enterprise cluster. Options include:
|
||||
|
||||
- Exporting all, or specific, databases
|
||||
- Filtering with starting and ending timestamps
|
||||
- Using gzip compression for smaller files and faster exports
|
||||
|
||||
For details on optional settings and usage, see [`influx_inspect export` command](/{{< latest "influxdb" "v1" >}}/tools/influx_inspect#export).
|
||||
|
||||
In the following example, the export is filtered to include only one day of data from the database and is compressed for optimal speed and file size.
|
||||
|
||||
```bash
|
||||
influx_inspect export -database myDB -compress -start 2019-05-19T00:00:00.000Z -end 2019-05-19T23:59:59.999Z
|
||||
```
|
||||
|
||||
### Importing data
|
||||
|
||||
After exporting the data in line protocol format, you can import the data using the [`influx -import` CLI command](/{{< latest "influxdb" "v1" >}}/tools/use-influx/#import).
|
||||
|
||||
In the following example, the compressed data file is imported into the specified database.
|
||||
|
||||
```bash
|
||||
# <path-to-export-file> is the gzipped file produced by influx_inspect export
influx -import -path=<path-to-export-file> -compressed -database myDB
|
||||
```
|
||||
|
||||
For details on using the `influx -import` command, see [Import data from a file with -import](/{{< latest "influxdb" "v1" >}}/tools/use-influx/#import-data-from-a-file-with-import).
|
||||
|
||||
### Example
|
||||
|
||||
For an example of using the export and import approach for disaster recovery, see the Capital One presentation from InfluxDays 2019, ["Architecting for Disaster Recovery"](https://www.youtube.com/watch?v=LyQDhSdnm4A). In this presentation, Capital One discusses the following (a minimal export script sketch follows the list):
|
||||
|
||||
- Exporting data every 15 minutes from an active cluster to an AWS S3 bucket.
|
||||
- Replicating the export file in the S3 bucket using the AWS S3 copy command.
|
||||
- Importing data every 15 minutes from the AWS S3 bucket to a cluster available for disaster recovery.
|
||||
- Advantages of the export-import approach over the standard backup and restore utilities for large volumes of data.
|
||||
- Managing users and scheduled exports and imports with a custom administration tool.
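The following is a minimal sketch of a scheduled export script in the spirit of this approach. It assumes `influx_inspect` and the AWS CLI are installed on a data node; the database name (`myDB`), export path, and bucket name (`my-dr-bucket`) are placeholders:

```bash
#!/bin/bash
# Export the last 15 minutes of data from myDB as compressed line protocol,
# then copy the export file to an S3 bucket for the disaster recovery cluster.
# Uses GNU date; add -datadir/-waldir flags if your data is not in the default location.
set -euo pipefail

EXPORT_FILE="/tmp/influxdb_export_$(date -u +%Y%m%dT%H%M%SZ).lp.gz"

influx_inspect export \
  -database myDB \
  -compress \
  -start "$(date -u -d '-15 minutes' +%Y-%m-%dT%H:%M:%SZ)" \
  -end "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  -out "$EXPORT_FILE"

aws s3 cp "$EXPORT_FILE" "s3://my-dr-bucket/exports/"
```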
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: Manage InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Use the `influxd-ctl` and `influx` command line tools to manage InfluxDB Enterprise clusters and data.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/cluster-commands/
|
||||
- /enterprise_influxdb/v1.9/features/cluster-commands/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Manage clusters
|
||||
weight: 40
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
Use the following tools to manage and interact with your InfluxDB Enterprise clusters:
|
||||
|
||||
- To manage clusters and nodes, back up and restore data, and rebalance clusters, use the [`influxd-ctl` cluster management utility](#influxd-ctl-cluster-management-utility)
|
||||
- To write and query data, use the [`influx` command line interface (CLI)](#influx-command-line-interface-cli)
|
||||
|
||||
## `influxd-ctl` cluster management utility
|
||||
|
||||
The [`influxd-ctl`](/enterprise_influxdb/v1.9/tools/influxd-ctl/) utility provides commands for managing your InfluxDB Enterprise clusters.
|
||||
Use the `influxd-ctl` cluster management utility to manage your cluster nodes, back up and restore data, and rebalance clusters.
|
||||
The `influxd-ctl` utility is available on all [meta nodes](/enterprise_influxdb/v1.9/concepts/glossary/#meta-node).
|
||||
|
||||
For more information, see [`influxd-ctl`](/enterprise_influxdb/v1.9/tools/influxd-ctl/).
|
||||
|
||||
## `influx` command line interface (CLI)
|
||||
|
||||
Use the `influx` command line interface (CLI) to write data to your cluster, query data interactively, and view query output in different formats.
|
||||
The `influx` CLI is available on all [data nodes](/enterprise_influxdb/v1.9/concepts/glossary/#data-node).
|
||||
|
||||
See [InfluxDB command line interface (CLI/shell)](/enterprise_influxdb/v1.9/tools/use-influx/) for details on using the `influx` command line interface.
|
|
@ -0,0 +1,286 @@
|
|||
---
|
||||
title: Configure InfluxDB Enterprise meta nodes
|
||||
description: >
|
||||
Configure InfluxDB Enterprise meta node settings and environment variables.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Configure meta nodes
|
||||
weight: 30
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
* [Meta node configuration settings](#meta-node-configuration-settings)
|
||||
* [Global options](#global-options)
|
||||
* [Enterprise license `[enterprise]`](#enterprise)
|
||||
* [Meta node `[meta]`](#meta)
|
||||
* [TLS `[tls]`](#tls-settings)
|
||||
|
||||
## Meta node configuration settings
|
||||
|
||||
### Global options
|
||||
|
||||
#### `reporting-disabled = false`
|
||||
|
||||
InfluxData, the company, relies on reported data from running nodes primarily to
|
||||
track the adoption rates of different InfluxDB versions.
|
||||
These data help InfluxData support the continuing development of InfluxDB.
|
||||
|
||||
The `reporting-disabled` option toggles the reporting of data every 24 hours to
|
||||
`usage.influxdata.com`.
|
||||
Each report includes a randomly-generated identifier, OS, architecture,
|
||||
InfluxDB version, and the number of databases, measurements, and unique series.
|
||||
To disable reporting, set this option to `true`.
|
||||
|
||||
> **Note:** No data from user databases are ever transmitted.
|
||||
|
||||
#### `bind-address = ""`
|
||||
|
||||
This setting is not intended for use.
|
||||
It will be removed in future versions.
|
||||
|
||||
#### `hostname = ""`
|
||||
|
||||
The hostname of the [meta node](/enterprise_influxdb/v1.9/concepts/glossary/#meta-node).
|
||||
This must be resolvable and reachable by all other members of the cluster.
|
||||
|
||||
Environment variable: `INFLUXDB_HOSTNAME`
|
||||
|
||||
-----
|
||||
|
||||
### Enterprise license settings
|
||||
#### `[enterprise]`
|
||||
|
||||
The `[enterprise]` section contains the parameters for the meta node's
|
||||
registration with the [InfluxData portal](https://portal.influxdata.com/).
|
||||
|
||||
#### `license-key = ""`
|
||||
|
||||
The license key created for you on [InfluxData portal](https://portal.influxdata.com).
|
||||
The meta node transmits the license key to
|
||||
[portal.influxdata.com](https://portal.influxdata.com) over port 80 or port 443
|
||||
and receives a temporary JSON license file in return.
|
||||
The server caches the license file locally.
|
||||
If your server cannot communicate with [https://portal.influxdata.com](https://portal.influxdata.com), you must use the [`license-path` setting](#license-path).
|
||||
|
||||
Use the same key for all nodes in the same cluster.
|
||||
{{% warn %}}The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.9/administration/renew-license/).
|
||||
|
||||
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_KEY`
|
||||
|
||||
#### `license-path = ""`
|
||||
|
||||
The local path to the permanent JSON license file that you received from InfluxData
|
||||
for instances that do not have access to the internet.
|
||||
To obtain a license file, contact [sales@influxdb.com](mailto:sales@influxdb.com).
|
||||
|
||||
The license file must be saved on every server in the cluster, including meta nodes
|
||||
and data nodes.
|
||||
The file contains the JSON-formatted license, and must be readable by the `influxdb` user.
|
||||
Each server in the cluster independently verifies its license.
|
||||
|
||||
{{% warn %}}
|
||||
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** You must restart meta nodes to update your configuration. For more information, see how to [renew or update your license key](/enterprise_influxdb/v1.9/administration/renew-license/).
|
||||
|
||||
Environment variable: `INFLUXDB_ENTERPRISE_LICENSE_PATH`
|
||||
|
||||
-----
|
||||
### Meta node settings
|
||||
|
||||
#### `[meta]`
|
||||
|
||||
#### `dir = "/var/lib/influxdb/meta"`
|
||||
|
||||
The directory where cluster meta data is stored.
|
||||
|
||||
Environment variable: `INFLUXDB_META_DIR`
|
||||
|
||||
#### `bind-address = ":8089"`
|
||||
|
||||
The bind address (port) for meta node communication.
|
||||
For simplicity, InfluxData recommends using the same port on all meta nodes,
|
||||
but this is not necessary.
|
||||
|
||||
Environment variable: `INFLUXDB_META_BIND_ADDRESS`
|
||||
|
||||
#### `http-bind-address = ":8091"`
|
||||
|
||||
The default address to bind the API to.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTP_BIND_ADDRESS`
|
||||
|
||||
#### `https-enabled = false`
|
||||
|
||||
Determines whether meta nodes use HTTPS to communicate with each other. By default, HTTPS is disabled. We strongly recommend enabling HTTPS.
|
||||
|
||||
To enable HTTPS, set `https-enabled` to `true`, specify the path to the SSL certificate with `https-certificate`, and specify the path to the SSL private key with `https-private-key`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_ENABLED`
|
||||
|
||||
#### `https-certificate = ""`
|
||||
|
||||
If HTTPS is enabled, specify the path to the SSL certificate.
|
||||
Use either:
|
||||
|
||||
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
|
||||
* Certificate only (`[certificate].crt`)
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_CERTIFICATE`
|
||||
|
||||
#### `https-private-key = ""`
|
||||
|
||||
If HTTPS is enabled, specify the path to the SSL private key.
|
||||
Use either:
|
||||
|
||||
* PEM-encoded bundle with both the certificate and key (`[bundled-crt-and-key].pem`)
|
||||
* Private key only (`[private-key].key`)
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_PRIVATE_KEY`
|
||||
|
||||
#### `https-insecure-tls = false`
|
||||
|
||||
Whether meta nodes skip certificate validation when communicating with each other over HTTPS.
|
||||
This is useful when testing with self-signed certificates.
|
||||
|
||||
Environment variable: `INFLUXDB_META_HTTPS_INSECURE_TLS`
|
||||
|
||||
#### `data-use-tls = false`
|
||||
|
||||
Whether to use TLS to communicate with data nodes.
|
||||
|
||||
#### `data-insecure-tls = false`
|
||||
|
||||
Whether meta nodes skip certificate validation when communicating with data nodes over TLS.
|
||||
This is useful when testing with self-signed certificates.
|
||||
|
||||
#### `gossip-frequency = "5s"`
|
||||
|
||||
The default frequency with which the node will gossip its known announcements.
|
||||
|
||||
#### `announcement-expiration = "30s"`
|
||||
|
||||
The default length of time an announcement is kept before it is considered too old.
|
||||
|
||||
#### `retention-autocreate = true`
|
||||
|
||||
Automatically create a default retention policy when creating a database.
|
||||
|
||||
#### `election-timeout = "1s"`
|
||||
|
||||
The amount of time in candidate state without a leader before we attempt an election.
|
||||
|
||||
#### `heartbeat-timeout = "1s"`
|
||||
|
||||
The amount of time in follower state without a leader before we attempt an election.
|
||||
|
||||
#### `leader-lease-timeout = "500ms"`
|
||||
|
||||
The leader lease timeout is the amount of time a Raft leader will remain leader
|
||||
if it does not hear from a majority of nodes.
|
||||
After the timeout the leader steps down to the follower state.
|
||||
Clusters with high latency between nodes may want to increase this parameter to
|
||||
avoid unnecessary Raft elections.
|
||||
|
||||
Environment variable: `INFLUXDB_META_LEADER_LEASE_TIMEOUT`
|
||||
|
||||
#### `commit-timeout = "50ms"`
|
||||
|
||||
The commit timeout is the amount of time a Raft node will tolerate between
|
||||
commands before issuing a heartbeat to tell the leader it is alive.
|
||||
The default setting should work for most systems.
|
||||
|
||||
Environment variable: `INFLUXDB_META_COMMIT_TIMEOUT`
|
||||
|
||||
#### `consensus-timeout = "30s"`
|
||||
|
||||
Timeout waiting for consensus before getting the latest Raft snapshot.
|
||||
|
||||
Environment variable: `INFLUXDB_META_CONSENSUS_TIMEOUT`
|
||||
|
||||
#### `cluster-tracing = false`
|
||||
|
||||
Cluster tracing toggles the logging of Raft logs on Raft nodes.
|
||||
Enable this setting when debugging Raft consensus issues.
|
||||
|
||||
Environment variable: `INFLUXDB_META_CLUSTER_TRACING`
|
||||
|
||||
#### `logging-enabled = true`
|
||||
|
||||
Meta logging toggles the logging of messages from the meta service.
|
||||
|
||||
Environment variable: `INFLUXDB_META_LOGGING_ENABLED`
|
||||
|
||||
#### `pprof-enabled = true`
|
||||
|
||||
Enables the `/debug/pprof` endpoint for troubleshooting.
|
||||
To disable, set the value to `false`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_PPROF_ENABLED`
|
||||
|
||||
#### `lease-duration = "1m0s"`
|
||||
|
||||
The default duration of the leases that data nodes acquire from the meta nodes.
|
||||
Leases automatically expire after the `lease-duration` is met.
|
||||
|
||||
Leases ensure that only one data node is running something at a given time.
|
||||
For example, [continuous queries](/enterprise_influxdb/v1.9/concepts/glossary/#continuous-query-cq)
|
||||
(CQs) use a lease so that all data nodes aren't running the same CQs at once.
|
||||
|
||||
For more details about `lease-duration` and its impact on continuous queries, see
|
||||
[Configuration and operational considerations on a cluster](/enterprise_influxdb/v1.9/features/clustering-features/#configuration-and-operational-considerations-on-a-cluster).
|
||||
|
||||
Environment variable: `INFLUXDB_META_LEASE_DURATION`
|
||||
|
||||
#### `auth-enabled = false`
|
||||
|
||||
If true, HTTP endpoints require authentication.
|
||||
This setting must have the same value as the data nodes' `meta.meta-auth-enabled` configuration.
|
||||
|
||||
#### `ldap-allowed = false`
|
||||
|
||||
Whether LDAP is allowed to be set.
|
||||
If true, you will need to use `influxd-ctl ldap set-config` and set `enabled = true` in the LDAP configuration file to use LDAP authentication.
|
||||
|
||||
#### `shared-secret = ""`
|
||||
|
||||
The shared secret to be used by the public API for creating custom JWT authentication.
|
||||
If you use this setting, set [`auth-enabled`](#auth-enabled-false) to `true`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_SHARED_SECRET`
|
||||
|
||||
#### `internal-shared-secret = ""`
|
||||
|
||||
The shared secret used by the internal API for JWT authentication for
|
||||
inter-node communication within the cluster.
|
||||
Set this to a long pass phrase.
|
||||
This value must be the same value as the
|
||||
[`[meta] meta-internal-shared-secret`](/enterprise_influxdb/v1.9/administration/config-data-nodes#meta-internal-shared-secret) in the data node configuration file.
|
||||
To use this option, set [`auth-enabled`](#auth-enabled-false) to `true`.
|
||||
|
||||
Environment variable: `INFLUXDB_META_INTERNAL_SHARED_SECRET`
|
||||
|
||||
### TLS settings
|
||||
|
||||
For more information, see [TLS settings for data nodes](/enterprise_influxdb/v1.9/administration/config-data-nodes#tls-settings).
|
||||
|
||||
#### Recommended "modern compatibility" cipher settings
|
||||
|
||||
```toml
|
||||
ciphers = [ "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
|
||||
"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
|
||||
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
|
||||
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
|
||||
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
|
||||
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
|
||||
]
|
||||
|
||||
min-version = "tls1.2"
|
||||
|
||||
max-version = "tls1.2"
|
||||
|
||||
```
|
|
@ -0,0 +1,182 @@
|
|||
---
|
||||
title: Configure InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Learn about global options, meta node options, data node options, and other InfluxDB Enterprise configuration settings.
|
||||
aliases:
|
||||
- /enterprise/v1.8/administration/configuration/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Configure clusters
|
||||
weight: 10
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
This page contains general information about configuring InfluxDB Enterprise clusters.
|
||||
For complete listings and descriptions of the configuration settings, see:
|
||||
|
||||
* [Configure data nodes](/enterprise_influxdb/v1.9/administration/config-data-nodes)
|
||||
* [Configure meta nodes](/enterprise_influxdb/v1.9/administration/config-meta-nodes)
|
||||
|
||||
## Use configuration files
|
||||
|
||||
### Display the default configurations
|
||||
|
||||
The following commands print out a TOML-formatted configuration with all
|
||||
available options set to their default values.
|
||||
|
||||
#### Meta node configuration
|
||||
|
||||
```bash
|
||||
influxd-meta config
|
||||
```
|
||||
|
||||
#### Data node configuration
|
||||
|
||||
```bash
|
||||
influxd config
|
||||
```
|
||||
|
||||
#### Create a configuration file
|
||||
|
||||
On POSIX systems, generate a new configuration file by redirecting the output
|
||||
of the command to a file.
|
||||
|
||||
New meta node configuration file:
|
||||
```
|
||||
influxd-meta config > /etc/influxdb/influxdb-meta-generated.conf
|
||||
```
|
||||
|
||||
New data node configuration file:
|
||||
```
|
||||
influxd config > /etc/influxdb/influxdb-generated.conf
|
||||
```
|
||||
|
||||
Preserve custom settings from older configuration files when generating a new
|
||||
configuration file with the `-config` option.
|
||||
For example, this overwrites any default configuration settings in the output
|
||||
file (`/etc/influxdb/influxdb.conf.new`) with the configuration settings from
|
||||
the file (`/etc/influxdb/influxdb.conf.old`) passed to `-config`:
|
||||
|
||||
```
|
||||
influxd config -config /etc/influxdb/influxdb.conf.old > /etc/influxdb/influxdb.conf.new
|
||||
```
|
||||
|
||||
#### Launch the process with a configuration file
|
||||
|
||||
There are two ways to launch the meta or data processes using your customized
|
||||
configuration file.
|
||||
|
||||
* Point the process to the desired configuration file with the `-config` option.
|
||||
|
||||
To start the meta node process with `/etc/influxdb/influxdb-meta-generate.conf`:
|
||||
|
||||
influxd-meta -config /etc/influxdb/influxdb-meta-generate.conf
|
||||
|
||||
To start the data node process with `/etc/influxdb/influxdb-generated.conf`:
|
||||
|
||||
influxd -config /etc/influxdb/influxdb-generated.conf
|
||||
|
||||
|
||||
* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your
|
||||
configuration file and start the process.
|
||||
|
||||
To set the `INFLUXDB_CONFIG_PATH` environment variable and launch the data
|
||||
process using `INFLUXDB_CONFIG_PATH` for the configuration file path:
|
||||
|
||||
export INFLUXDB_CONFIG_PATH=/root/influxdb.generated.conf
|
||||
echo $INFLUXDB_CONFIG_PATH
|
||||
/root/influxdb.generated.conf
|
||||
influxd
|
||||
|
||||
If set, the command line `-config` path overrides any environment variable path.
|
||||
If you do not supply a configuration file, InfluxDB uses an internal default
|
||||
configuration (equivalent to the output of `influxd config` and `influxd-meta
|
||||
config`).
|
||||
|
||||
{{% warn %}} As of version 1.3, if no configuration file is specified, the `influxd-meta` binary checks the `INFLUXDB_META_CONFIG_PATH` environment variable.
|
||||
If that environment variable is set, its value is used as the configuration file path.
|
||||
If unset, the binary checks the `~/.influxdb` and `/etc/influxdb` directories for an `influxdb-meta.conf` file.
|
||||
If the file exists in either location, the first one found is loaded as the configuration file automatically.
|
||||
<br>
|
||||
This matches a similar behavior that the open source and data node versions of InfluxDB already follow.
|
||||
{{% /warn %}}
|
||||
|
||||
Configure InfluxDB using the configuration file (`influxdb.conf`) and environment variables.
|
||||
The default value for each configuration setting is shown in the documentation.
|
||||
Commented configuration options use the default value.
|
||||
|
||||
Configuration settings with a duration value support the following duration units:
|
||||
|
||||
- `ns` _(nanoseconds)_
|
||||
- `us` or `µs` _(microseconds)_
|
||||
- `ms` _(milliseconds)_
|
||||
- `s` _(seconds)_
|
||||
- `m` _(minutes)_
|
||||
- `h` _(hours)_
|
||||
- `d` _(days)_
|
||||
- `w` _(weeks)_
|
||||
|
||||
### Environment variables
|
||||
|
||||
All configuration options can be specified in the configuration file or in
|
||||
environment variables.
|
||||
Environment variables override the equivalent options in the configuration
|
||||
file.
|
||||
If a configuration option is not specified in either the configuration file
|
||||
or in an environment variable, InfluxDB uses its internal default
|
||||
configuration.
|
||||
|
||||
In the sections below we name the relevant environment variable in the
|
||||
description for the configuration setting.
|
||||
Environment variables can be set in `/etc/default/influxdb-meta` and
|
||||
`/etc/default/influxdb`.
|
||||
|
||||
> **Note:**
|
||||
To set or override settings in a config section that allows multiple
|
||||
configurations (any section with double brackets (`[[...]]`) in the header supports
|
||||
multiple configurations), the desired configuration must be specified by ordinal
|
||||
number.
|
||||
For example, for the first set of `[[graphite]]` environment variables,
|
||||
prefix the configuration setting name in the environment variable with the
|
||||
relevant position number (in this case: `0`):
|
||||
>
|
||||
INFLUXDB_GRAPHITE_0_BATCH_PENDING
|
||||
INFLUXDB_GRAPHITE_0_BATCH_SIZE
|
||||
INFLUXDB_GRAPHITE_0_BATCH_TIMEOUT
|
||||
INFLUXDB_GRAPHITE_0_BIND_ADDRESS
|
||||
INFLUXDB_GRAPHITE_0_CONSISTENCY_LEVEL
|
||||
INFLUXDB_GRAPHITE_0_DATABASE
|
||||
INFLUXDB_GRAPHITE_0_ENABLED
|
||||
INFLUXDB_GRAPHITE_0_PROTOCOL
|
||||
INFLUXDB_GRAPHITE_0_RETENTION_POLICY
|
||||
INFLUXDB_GRAPHITE_0_SEPARATOR
|
||||
INFLUXDB_GRAPHITE_0_TAGS
|
||||
INFLUXDB_GRAPHITE_0_TEMPLATES
|
||||
INFLUXDB_GRAPHITE_0_UDP_READ_BUFFER
|
||||
>
|
||||
For the Nth Graphite configuration in the configuration file, the relevant
|
||||
environment variables would be of the form `INFLUXDB_GRAPHITE_(N-1)_BATCH_PENDING`.
|
||||
For each section of the configuration file the numbering restarts at zero.
|
||||
|
||||
### `GOMAXPROCS` environment variable
|
||||
|
||||
{{% note %}}
|
||||
_**Note:**_ `GOMAXPROCS` cannot be set using the InfluxDB configuration file.
|
||||
It can only be set as an environment variable.
|
||||
{{% /note %}}
|
||||
|
||||
The `GOMAXPROCS` [Go language environment variable](https://golang.org/pkg/runtime/#hdr-Environment_Variables)
|
||||
can be used to set the maximum number of CPUs that can execute simultaneously.
|
||||
|
||||
The default value of `GOMAXPROCS` is the number of CPUs
|
||||
that are visible to the program *on startup*
|
||||
(based on what the operating system considers to be a CPU).
|
||||
For a 32-core machine, the `GOMAXPROCS` value would be `32`.
|
||||
You can override this value to be less than the maximum value,
|
||||
which can be useful in cases where you are running InfluxDB
|
||||
along with other processes on the same machine
|
||||
and want to ensure that the database doesn't negatively affect those processes.
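For example, a minimal sketch (assuming a sysvinit-style install where service environment variables are read from `/etc/default/influxdb`):

```bash
# /etc/default/influxdb on a data node: limit InfluxDB to 8 CPUs
GOMAXPROCS=8
```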
|
||||
|
||||
{{% note %}}
|
||||
_**Note:**_ Setting `GOMAXPROCS=1` eliminates all parallelization.
|
||||
{{% /note %}}
|
|
@ -0,0 +1,211 @@
|
|||
---
|
||||
title: Configure LDAP authentication in InfluxDB Enterprise
|
||||
description: >
|
||||
Configure LDAP authentication in InfluxDB Enterprise and test LDAP connectivity.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Configure LDAP authentication
|
||||
weight: 40
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
Configure InfluxDB Enterprise to use LDAP (Lightweight Directory Access Protocol) to:
|
||||
|
||||
- Validate user permissions
|
||||
- Synchronize InfluxDB with LDAP so the LDAP server doesn't need to be queried for each request
|
||||
|
||||
{{% note %}}
|
||||
To configure InfluxDB Enterprise to support LDAP, all users must be managed in the remote LDAP service.
|
||||
If LDAP is configured and enabled, users **must** authenticate through LDAP, including users who may have existed before enabling LDAP.
|
||||
{{% /note %}}
|
||||
|
||||
## Configure LDAP for an InfluxDB Enterprise cluster
|
||||
|
||||
To use LDAP with an InfluxDB Enterprise cluster, do the following:
|
||||
|
||||
1. [Configure data nodes](#configure-data-nodes)
|
||||
2. [Configure meta nodes](#configure-meta-nodes)
|
||||
3. [Create, verify, and upload the LDAP configuration file](#create-verify-and-upload-the-ldap-configuration-file)
|
||||
4. [Restart meta and data nodes](#restart-meta-and-data-nodes)
|
||||
|
||||
### Configure data nodes
|
||||
|
||||
Update the following settings in each data node configuration file (`/etc/influxdb/influxdb.conf`); a configuration sketch follows the list:
|
||||
|
||||
1. Under `[http]`, enable HTTP authentication by setting `auth-enabled` to `true`.
|
||||
(Or set the corresponding environment variable `INFLUXDB_HTTP_AUTH_ENABLED` to `true`.)
|
||||
2. Configure the HTTP shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the secret and username.
|
||||
Set the `[http]` configuration setting for `shared-secret`, or the corresponding environment variable `INFLUXDB_HTTP_SHARED_SECRET`.
|
||||
3. If you're enabling authentication on meta nodes, you must also include the following configurations:
|
||||
- The `INFLUXDB_META_META_AUTH_ENABLED` environment variable, or the `[meta]` configuration setting `meta-auth-enabled`, is set to `true`.
|
||||
This value must be the same value as the meta node's `meta.auth-enabled` configuration.
|
||||
- The `INFLUXDB_META_META_INTERNAL_SHARED_SECRET` environment variable, or the corresponding `[meta]` configuration setting `meta-internal-shared-secret`, is set to the shared secret value.
|
||||
This value must be the same value as the meta node's `meta.internal-shared-secret`.
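A minimal sketch of the resulting data node settings (the secret values are placeholders and must match the corresponding meta node settings):

```toml
[http]
  auth-enabled = true
  shared-secret = "<http-shared-secret>"

[meta]
  meta-auth-enabled = true
  meta-internal-shared-secret = "<internal-shared-secret>"
```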
|
||||
|
||||
### Configure meta nodes
|
||||
|
||||
Update the following settings in each meta node configuration file (`/etc/influxdb/influxdb-meta.conf`):
|
||||
|
||||
1. Configure the meta node internal shared secret to validate requests using JSON web tokens (JWT) and sign each HTTP payload with the username and shared secret.
|
||||
2. Set the `[meta]` configuration setting `internal-shared-secret` to `"<internal-shared-secret>"`.
|
||||
(Or set the `INFLUXDB_META_INTERNAL_SHARED_SECRET` environment variable.)
|
||||
3. Set the `[meta]` configuration setting `meta.ldap-allowed` to `true` on all meta nodes in your cluster.
|
||||
(Or set the `INFLUXDB_META_LDAP_ALLOWED` environment variable.)
|
||||
|
||||
### Authenticate your connection to InfluxDB
|
||||
|
||||
To authenticate your InfluxDB connection, run the following command, replacing `username:password` with your credentials:
|
||||
|
||||
{{< keep-url >}}
|
||||
```bash
|
||||
curl -u username:password -XPOST "http://localhost:8086/..."
|
||||
```
|
||||
|
||||
For more detail on authentication, see [Authentication and authorization in InfluxDB](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/).
|
||||
|
||||
### Create, verify, and upload the LDAP configuration file
|
||||
|
||||
1. To create a sample LDAP configuration file, run the following command:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap sample-config
|
||||
```
|
||||
|
||||
2. Save the sample file and edit as needed for your LDAP server.
|
||||
For detail, see the [sample LDAP configuration file](#sample-ldap-configuration) below.
|
||||
|
||||
> To use fine-grained authorization (FGA) with LDAP, you must map InfluxDB Enterprise roles to key-value pairs in the LDAP database.
|
||||
For more information, see [Fine-grained authorization in InfluxDB Enterprise](/enterprise_influxdb/v1.9/guides/fine-grained-authorization/).
|
||||
The InfluxDB admin user doesn't include permissions for InfluxDB Enterprise roles.
|
||||
|
||||
3. Restart all meta and data nodes in your InfluxDB Enterprise cluster to load your updated configuration.
|
||||
|
||||
On each **meta** node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb-meta restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
On each **data** node, run:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[sysvinit](#)
|
||||
[systemd](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
service influxdb restart
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```sh
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
|
||||
4. To verify your LDAP configuration, run:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap verify -ldap-config /path/to/ldap.toml
|
||||
```
|
||||
|
||||
5. To load your LDAP configuration file, run the following command:
|
||||
|
||||
```bash
|
||||
influxd-ctl ldap set-config /path/to/ldap.toml
|
||||
```
|
||||
|
||||
|
||||
## Sample LDAP configuration
|
||||
|
||||
The following is a sample configuration file that connects to a publicly available LDAP server.
|
||||
|
||||
A `DN` ("distinguished name") uniquely identifies an entry and describes its position in the directory information tree (DIT) hierarchy.
|
||||
The DN of an LDAP entry is similar to a file path on a file system.
|
||||
`DNs` refers to multiple DN entries.
|
||||
|
||||
{{% truncate %}}
|
||||
```toml
|
||||
enabled = true
|
||||
|
||||
[[servers]]
|
||||
enabled = true
|
||||
|
||||
[[servers]]
|
||||
host = "<LDAPserver>"
|
||||
port = 389
|
||||
|
||||
# Security mode for LDAP connection to this server.
|
||||
# The recommended security mode, "starttls", is set by default. It uses an initial unencrypted connection
|
||||
# and upgrades to TLS as the first action against the server,
|
||||
# per the LDAPv3 standard.
|
||||
# Other options are "starttls+insecure" to behave the same as starttls
|
||||
# but skip server certificate verification, or "none" to use an unencrypted connection.
|
||||
security = "starttls"
|
||||
|
||||
# Credentials to use when searching for a user or group.
|
||||
bind-dn = "cn=read-only-admin,dc=example,dc=com"
|
||||
bind-password = "password"
|
||||
|
||||
# Base DNs to use when applying the search-filter to discover an LDAP user.
|
||||
search-base-dns = [
|
||||
"dc=example,dc=com",
|
||||
]
|
||||
|
||||
# LDAP filter to discover a user's DN.
|
||||
# %s will be replaced with the provided username.
|
||||
search-filter = "(uid=%s)"
|
||||
# On Active Directory you might use "(sAMAccountName=%s)".
|
||||
|
||||
# Base DNs to use when searching for groups.
|
||||
group-search-base-dns = ["dc=example,dc=com"]
|
||||
|
||||
# LDAP filter to identify groups that a user belongs to.
|
||||
# %s will be replaced with the user's DN.
|
||||
group-membership-search-filter = "(&(objectClass=groupOfUniqueNames)(uniqueMember=%s))"
|
||||
# On Active Directory you might use "(&(objectClass=group)(member=%s))".
|
||||
|
||||
# Attribute to use to determine the "group" in the group-mappings section.
|
||||
group-attribute = "ou"
|
||||
# On Active Directory you might use "cn".
|
||||
|
||||
# LDAP filter to search for a group with a particular name.
|
||||
# This is used when warming the cache to load group membership.
|
||||
group-search-filter = "(&(objectClass=groupOfUniqueNames)(cn=%s))"
|
||||
# On Active Directory you might use "(&(objectClass=group)(cn=%s))".
|
||||
|
||||
# Attribute of a group that contains the DNs of the group's members.
|
||||
group-member-attribute = "uniqueMember"
|
||||
# On Active Directory you might use "member".
|
||||
|
||||
# Create an administrator role in InfluxDB and then log in as a member of the admin LDAP group. Only members of a group with the administrator role can complete admin tasks.
|
||||
# For example, if tesla is the only member of the `italians` group, you must log in as tesla/password.
|
||||
admin-groups = ["italians"]
|
||||
|
||||
# These two roles would have to be created by hand if you want these LDAP group memberships to do anything.
|
||||
[[servers.group-mappings]]
|
||||
group = "mathematicians"
|
||||
role = "arithmetic"
|
||||
|
||||
[[servers.group-mappings]]
|
||||
group = "scientists"
|
||||
role = "laboratory"
|
||||
|
||||
```
|
||||
{{% /truncate %}}
|
|
@ -0,0 +1,133 @@
|
|||
---
|
||||
title: Log and trace InfluxDB Enterprise operations
|
||||
description: >
|
||||
Learn about logging locations, redirecting HTTP request logging, structured logging, and tracing.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Log and trace
|
||||
weight: 90
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
|
||||
* [Logging locations](#logging-locations)
|
||||
* [Redirect HTTP request logging](#redirect-http-access-logging)
|
||||
* [Structured logging](#structured-logging)
|
||||
* [Tracing](#tracing)
|
||||
|
||||
|
||||
InfluxDB writes log output, by default, to `stderr`.
|
||||
Depending on your use case, this log information can be written to another location.
|
||||
Some service managers may override this default.
|
||||
|
||||
## Logging locations
|
||||
|
||||
### Run InfluxDB directly
|
||||
|
||||
If you run InfluxDB directly, using `influxd`, all logs will be written to `stderr`.
|
||||
You may redirect this log output as you would any output to `stderr` like so:
|
||||
|
||||
```bash
|
||||
influxd-meta 2>$HOME/my_log_file # Meta nodes
|
||||
influxd 2>$HOME/my_log_file # Data nodes
|
||||
influx-enterprise 2>$HOME/my_log_file # Enterprise Web
|
||||
```
|
||||
|
||||
### Launched as a service
|
||||
|
||||
#### sysvinit
|
||||
|
||||
If InfluxDB was installed using a pre-built package, and then launched
|
||||
as a service, `stderr` is redirected to
|
||||
`/var/log/influxdb/<node-type>.log`, and all log data will be written to
|
||||
that file. You can override this location by setting the variable
|
||||
`STDERR` in the file `/etc/default/<node-type>`.
|
||||
|
||||
For example, if on a data node `/etc/default/influxdb` contains:
|
||||
|
||||
```bash
|
||||
STDERR=/dev/null
|
||||
```
|
||||
|
||||
all log data will be discarded. You can similarly direct output to
|
||||
`stdout` by setting `STDOUT` in the same file. Output to `stdout` is
|
||||
sent to `/dev/null` by default when InfluxDB is launched as a service.
|
||||
|
||||
InfluxDB must be restarted to pick up any changes to `/etc/default/<node-type>`.
|
||||
|
||||
|
||||
##### Meta nodes
|
||||
|
||||
For meta nodes, the `<node-type>` is `influxdb-meta`.
|
||||
The default log file is `/var/log/influxdb/influxdb-meta.log`.
|
||||
The service configuration file is `/etc/default/influxdb-meta`.
|
||||
|
||||
##### Data nodes
|
||||
|
||||
For data nodes, the `<node-type>` is `influxdb`.
|
||||
The default log file is `/var/log/influxdb/influxdb.log`.
|
||||
The service configuration file is `/etc/default/influxdb`.
|
||||
|
||||
##### Enterprise Web
|
||||
|
||||
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
|
||||
The default log file is `/var/log/influxdb/influx-enterprise.log`.
|
||||
The service configuration file is `/etc/default/influx-enterprise`.
|
||||
|
||||
#### systemd
|
||||
|
||||
Starting with version 1.0, InfluxDB on systemd systems no longer
|
||||
writes files to `/var/log/influxdb/<node-type>.log` by default, and now uses the
|
||||
system configured default for logging (usually `journald`). On most
|
||||
systems, the logs will be directed to the systemd journal and can be
|
||||
accessed with the command:
|
||||
|
||||
```
|
||||
sudo journalctl -u <node-type>.service
|
||||
```
|
||||
|
||||
Please consult the systemd journald documentation for configuring
|
||||
journald.
|
||||
|
||||
##### Meta nodes
|
||||
|
||||
For meta nodes, the `<node-type>` is `influxdb-meta`.
|
||||
The default log command is `sudo journalctl -u influxdb-meta.service`
|
||||
The service configuration file is `/etc/default/influxdb-meta`.
|
||||
|
||||
##### Data nodes
|
||||
|
||||
For data nodes, the `<node-type>` is `influxdb`.
|
||||
The default log command is `sudo journalctl -u influxdb.service`
|
||||
The service configuration file is `/etc/default/influxdb`.
|
||||
|
||||
##### Enterprise Web
|
||||
|
||||
For Enterprise Web nodes, the `<node-type>` is `influx-enterprise`.
|
||||
The default log command is `sudo journalctl -u influx-enterprise.service`
|
||||
The service configuration file is `/etc/default/influx-enterprise`.
|
||||
|
||||
### Use logrotate
|
||||
|
||||
You can use [logrotate](http://manpages.ubuntu.com/manpages/cosmic/en/man8/logrotate.8.html)
|
||||
to rotate the log files generated by InfluxDB on systems where logs are written to flat files.
|
||||
If using the package install on a sysvinit system, the config file for logrotate is installed in `/etc/logrotate.d`.
|
||||
You can view the file [here](https://github.com/influxdb/influxdb/blob/master/scripts/logrotate).
|
||||
|
||||
## Redirect HTTP access logging
|
||||
|
||||
InfluxDB 1.5 introduces the option to log HTTP request traffic separately from the other InfluxDB log output. When HTTP request logging is enabled, the HTTP logs are intermingled by default with internal InfluxDB logging. By redirecting the HTTP request log entries to a separate file, both log files are easier to read, monitor, and debug.
|
||||
|
||||
See [Redirecting HTTP request logging](/enterprise_influxdb/v1.9/administration/logs/#redirecting-http-access-logging) in the InfluxDB OSS documentation.
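A minimal sketch of the relevant data node configuration (the `[http]` settings are described in the linked OSS documentation; the log path is an example):

```toml
[http]
  log-enabled = true
  # Write HTTP request logs to a separate file instead of the main InfluxDB log
  access-log-path = "/var/log/influxdb/access.log"
```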
|
||||
|
||||
## Structured logging
|
||||
|
||||
Starting with InfluxDB 1.5, structured logging is supported, enabling machine-readable and more developer-friendly log output formats. The two structured log formats, `logfmt` and `json`, provide easier filtering and searching with external tools and simplify integration of InfluxDB logs with Splunk, Papertrail, Elasticsearch, and other third-party tools.
|
||||
|
||||
See [Structured logging](/enterprise_influxdb/v1.9/administration/logs/#structured-logging) in the InfluxDB OSS documentation.
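A minimal sketch of the relevant data node configuration (the `[logging]` section is described in the linked OSS documentation):

```toml
[logging]
  # Use "logfmt" or "json" for structured, machine-readable output
  format = "logfmt"
  level = "info"
```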
|
||||
|
||||
## Tracing
|
||||
|
||||
Logging has been enhanced, starting in InfluxDB 1.5, to provide tracing of important InfluxDB operations. Tracing is useful for error reporting and discovering performance bottlenecks.
|
||||
|
||||
See [Tracing](/enterprise_influxdb/v1.9/administration/logs/#tracing) in the InfluxDB OSS documentation.
|
|
@ -0,0 +1,91 @@
|
|||
---
|
||||
title: TCP and UDP ports used in InfluxDB Enterprise
|
||||
description: Configure TCP and UDP ports in InfluxDB Enterprise.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Configure TCP and UDP Ports
|
||||
weight: 120
|
||||
parent: Administration
|
||||
---
|
||||
|
||||

|
||||
|
||||
## Enabled ports
|
||||
|
||||
### 8086
|
||||
|
||||
The default port that runs the InfluxDB HTTP service.
|
||||
It is used for the primary public write and query API.
|
||||
Clients include the CLI, Chronograf, InfluxDB client libraries, Grafana, curl, or anything that wants to write and read time series data to and from InfluxDB.
|
||||
[Configure this port](/enterprise_influxdb/v1.9/administration/config-data-nodes/#bind-address-8086)
|
||||
in the data node configuration file.
|
||||
|
||||
_See also: [API Reference](/enterprise_influxdb/v1.9/tools/api/)._
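For example, to confirm the HTTP service is reachable on a data node (assuming the default port and a local node), you can hit the `/ping` endpoint:

```bash
# A healthy node responds with HTTP 204 No Content
curl -sS -i "http://localhost:8086/ping"
```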
|
||||
|
||||
### 8088
|
||||
|
||||
Data nodes listen on this port.
|
||||
Primarily used by other data nodes to handle distributed reads and writes at runtime.
|
||||
Used to control a data node (e.g., tell it to write to a specific shard or execute a query).
|
||||
It's also used by meta nodes for cluster-type operations (e.g., tell a data node to join or leave the cluster).
|
||||
|
||||
This is the default port for RPC calls, used for inter-node communication and by the CLI for backup and restore operations
|
||||
(`influxd backup` and `influxd restore`).
|
||||
[Configure this port](/enterprise_influxdb/v1.9/administration/config#bind-address-127-0-0-1-8088)
|
||||
in the configuration file.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
_See also: [Backup and Restore](/enterprise_influxdb/v1.9/administration/backup_and_restore/)._
|
||||
|
||||
### 8089
|
||||
|
||||
Used for communication between meta nodes.
|
||||
It is used by the Raft consensus protocol.
|
||||
The only clients using `8089` should be the other meta nodes in the cluster.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
### 8091
|
||||
|
||||
Meta nodes listen on this port.
|
||||
It is used for the meta service API.
|
||||
Primarily used by data nodes to stay in sync about databases, retention policies, shards, users, privileges, etc.
|
||||
Used by meta nodes to receive incoming connections from data nodes and Chronograf.
|
||||
Clients also include the `influxd-ctl` command line tool and Chronograf.
|
||||
|
||||
This port should not be exposed outside the cluster.
|
||||
|
||||
## Disabled ports
|
||||
|
||||
### 2003
|
||||
|
||||
The default port that runs the Graphite service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.9/administration/config#bind-address-2003)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [Graphite README](https://github.com/influxdata/influxdb/tree/1.8/services/graphite/README.md)
|
||||
|
||||
### 4242
|
||||
|
||||
The default port that runs the OpenTSDB service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.9/administration/config#bind-address-4242)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [OpenTSDB README](https://github.com/influxdata/influxdb/tree/1.8/services/opentsdb/README.md)
|
||||
|
||||
### 8089
|
||||
|
||||
The default port that runs the UDP service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.9/administration/config#bind-address-8089)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [UDP README](https://github.com/influxdata/influxdb/tree/1.8/services/udp/README.md)
|
||||
|
||||
### 25826
|
||||
|
||||
The default port that runs the Collectd service.
|
||||
[Enable and configure this port](/enterprise_influxdb/v1.9/administration/config#bind-address-25826)
|
||||
in the configuration file.
|
||||
|
||||
**Resources** [Collectd README](https://github.com/influxdata/influxdb/tree/1.8/services/collectd/README.md)
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
title: Rename hosts in InfluxDB Enterprise
|
||||
description: Rename a host within your InfluxDB Enterprise instance.
|
||||
aliases:
|
||||
- /enterprise/v1.8/administration/renaming/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Rename hosts
|
||||
weight: 100
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
## Host renaming
|
||||
|
||||
The following instructions allow you to rename a host within your InfluxDB Enterprise instance.
|
||||
|
||||
First, suspend write and query activity to the cluster.
|
||||
|
||||
### Rename meta nodes
|
||||
|
||||
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node. We will start with the two meta nodes that are not leaders.
|
||||
- On a non-leader meta node, run `influxd-ctl remove-meta`. Once removed, confirm by running `influxd-ctl show` on the meta leader.
|
||||
- Stop the meta service on the removed node and edit its configuration file (`/etc/influxdb/influxdb-meta.conf`) to set the new `hostname`.
|
||||
- Update the operating system hostname if needed and apply any DNS changes.
|
||||
- Start the meta service.
|
||||
- On the meta leader, add the meta node with the new hostname using `influxd-ctl add-meta newmetanode:8091`. Confirm with `influxd-ctl show`
|
||||
- Repeat for the second meta node.
|
||||
- Once the two non-leaders are updated, stop the leader and wait for another meta node to become the leader - check with `curl localhost:8091/status`.
|
||||
- Repeat the process for the last meta node (former leader).
|
||||
|
||||
### Intermediate verification
|
||||
|
||||
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
|
||||
- Verify there is a meta leader with `curl localhost:8091/status` and that all meta nodes list the rest in the output.
|
||||
- Restart all data nodes one by one. Verify that `/var/lib/influxdb/meta/client.json` on all data nodes references the new meta names.
|
||||
- Verify that the `influxd-ctl show-shards` output lists all shards and node ownership as expected.
|
||||
- Verify that the cluster is in good shape functional-wise, responds to writes and queries.
|
||||
|
||||
### Rename data nodes
|
||||
|
||||
- Find the meta node leader with `curl localhost:8091/status`. The `leader` field in the JSON output reports the leader meta node.
|
||||
- Stop the service on the data node you want to rename. Edit its configuration file to set the new `hostname` under `/etc/influxdb/influxdb.conf`.
|
||||
- Update the operating system hostname if needed and apply any DNS changes.
|
||||
- Start the data service. Errors will be logged until it is added to the cluster again.
|
||||
- On the meta node leader, run `influxd-ctl update-data oldname:8088 newname:8088`. Upon success, you will see a message confirming that the data node was updated to `newname:8088`.
|
||||
- Verify with `influxd-ctl show` on the meta node leader. Verify there are no errors in the logs of the updated data node and other data nodes. Restart the service on the updated data node. Verify writes, replication and queries work as expected.
|
||||
- Repeat on the remaining data nodes. Remember to only execute the `update-data` command from the meta leader.
|
||||
|
||||
### Final verification
|
||||
|
||||
- Verify the state of the cluster with `influxd-ctl show`. The version must be reported on all nodes for them to be healthy.
|
||||
- Verify that the `influxd-ctl show-shards` output lists all shards and node ownership as expected.
|
||||
- Verify that meta queries work (for example, `SHOW MEASUREMENTS` on a database).
|
||||
- Verify data are being queried successfully.
|
||||
|
||||
Once you've performed the verification steps, resume write and query activity.
|
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
title: Renew or update a license key or file
|
||||
description: >
|
||||
Renew or update a license key or file for your InfluxDB enterprise cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Renew a license
|
||||
weight: 50
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
Use this procedure to renew or update an existing license key or file, switch from a license key to a license file, or switch from a license file to a license key.
|
||||
|
||||
> **Note:** To request a new license to renew or expand your InfluxDB Enterprise cluster, contact [sales@influxdb.com](mailto:sales@influxdb.com).
|
||||
|
||||
To update a license key or file, do the following:
|
||||
|
||||
1. If you are switching from a license key to a license file (or vice versa), delete your existing license key or file.
|
||||
2. **Add the license key or file** to your [meta nodes](/enterprise_influxdb/v1.9/administration/config-meta-nodes/#enterprise-license-settings) and [data nodes](/enterprise_influxdb/v1.9/administration/config-data-nodes/#enterprise-license-settings) configuration settings. For more information, see [how to configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.9/administration/configuration/).
|
||||
3. **On each meta node**, run `service influxdb-meta restart`, and wait for the meta node service to come back up successfully before restarting the next meta node.
|
||||
The cluster should remain unaffected as long as only one node is restarting at a time.
|
||||
4. **On each data node**, run `killall -s HUP influxd` to signal the `influxd` process to reload its configuration file.
|
|
@ -0,0 +1,53 @@
|
|||
---
|
||||
title: Manage security in InfluxDB Enterprise
|
||||
description: Protect the data in your InfluxDB Enterprise instance.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Manage security
|
||||
weight: 110
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
Some customers may choose to install InfluxDB Enterprise with public internet access; however, doing so can inadvertently expose your data and invite unwelcome attacks on your database.
|
||||
Check out the sections below for how to protect the data in your InfluxDB Enterprise instance.
|
||||
|
||||
## Enable authentication
|
||||
|
||||
Password protect your InfluxDB Enterprise instance to keep any unauthorized individuals
|
||||
from accessing your data.
|
||||
|
||||
Resources:
|
||||
[Set up Authentication](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#set-up-authentication)
|
||||
|
||||
## Manage users and permissions
|
||||
|
||||
Restrict access by creating individual users and assigning them relevant
|
||||
read and/or write permissions.
|
||||
|
||||
Resources:
|
||||
[User types and privileges](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-types-and-privileges),
|
||||
[User management commands](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-management-commands),
|
||||
[Fine-grained authorization](/enterprise_influxdb/v1.9/guides/fine-grained-authorization/)
|
||||
|
||||
## Enable HTTPS
|
||||
|
||||
Using HTTPS secures the communication between clients and the InfluxDB server, and, in
|
||||
some cases, HTTPS verifies the authenticity of the InfluxDB server to clients (bi-directional authentication).
|
||||
Communication between the meta nodes and the data nodes is also secured via HTTPS.
|
||||
|
||||
Resources:
|
||||
[Enabling HTTPS](/enterprise_influxdb/v1.9/guides/https_setup/)
|
||||
|
||||
## Secure your host
|
||||
|
||||
### Ports
|
||||
|
||||
For InfluxDB Enterprise data nodes, close all ports on each host except for port `8086`.
|
||||
You can also use a proxy to port `8086`. By default, data nodes and meta nodes communicate with each other over ports `8088`, `8089`, and `8091`.
|
||||
|
||||
For InfluxDB Enterprise, [backing up and restoring](/enterprise_influxdb/v1.9/administration/backup-and-restore/) is performed from the meta nodes.
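For example, a minimal sketch using `ufw` on Ubuntu (adjust the subnet and tool to match your environment; this is illustrative, not a complete firewall policy):

```bash
# Allow client access to the HTTP API
sudo ufw allow 8086/tcp

# Restrict inter-node ports to hosts in the cluster subnet (example subnet)
sudo ufw allow from 10.0.0.0/24 to any port 8088:8089 proto tcp
sudo ufw allow from 10.0.0.0/24 to any port 8091 proto tcp
```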
|
||||
|
||||
|
||||
### AWS Recommendations
|
||||
|
||||
InfluxData recommends implementing on-disk encryption; InfluxDB does not offer built-in support to encrypt the data.
|
|
@ -0,0 +1,167 @@
|
|||
---
|
||||
title: Monitor InfluxDB servers
|
||||
description: Troubleshoot and monitor InfluxDB Enterprise servers.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/administration/statistics/
|
||||
- /enterprise_influxdb/v1.9/troubleshooting/statistics/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Monitor InfluxDB
|
||||
weight: 80
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
**On this page**
|
||||
|
||||
* [SHOW STATS](#show-stats)
|
||||
* [SHOW DIAGNOSTICS](#show-diagnostics)
|
||||
* [Internal monitoring](#internal-monitoring)
|
||||
* [Useful performance metrics commands](#useful-performance-metrics-commands)
|
||||
* [InfluxDB `/metrics` HTTP endpoint](#influxdb-metrics-http-endpoint)
|
||||
|
||||
|
||||
InfluxDB can display statistical and diagnostic information about each node.
|
||||
This information can be very useful for troubleshooting and performance monitoring.
|
||||
|
||||
## SHOW STATS
|
||||
|
||||
To see node statistics, execute the command `SHOW STATS`.
|
||||
For details on this command, see [`SHOW STATS`](/enterprise_influxdb/v1.9/query_language/spec#show-stats) in the InfluxQL specification.
|
||||
|
||||
The statistics returned by `SHOW STATS` are stored in memory only, and are reset to zero when the node is restarted.
|
||||
|
||||
## SHOW DIAGNOSTICS
|
||||
|
||||
To see node diagnostic information, execute the command `SHOW DIAGNOSTICS`.
|
||||
This returns information such as build information, uptime, hostname, server configuration, memory usage, and Go runtime diagnostics.
|
||||
For details on this command, see [`SHOW DIAGNOSTICS`](/enterprise_influxdb/v1.9/query_language/spec#show-diagnostics) in the InfluxQL specification.
|
||||
|
||||
## Internal monitoring
|
||||
InfluxDB also writes statistical and diagnostic information to a database named `_internal`, which records metrics on the internal runtime and service performance.
|
||||
The `_internal` database can be queried and manipulated like any other InfluxDB database.
|
||||
Check out the [monitor service README](https://github.com/influxdata/influxdb/blob/1.8/monitor/README.md) and the [internal monitoring blog post](https://www.influxdata.com/blog/how-to-use-the-show-stats-command-and-the-_internal-database-to-monitor-influxdb/) for more detail.
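
For example, to list the measurements recorded in the `_internal` database (a minimal sketch using the `influx` CLI on a node with default settings):

```bash
influx -database '_internal' -execute 'SHOW MEASUREMENTS'
```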
|
||||
|
||||
## Useful performance metrics commands
|
||||
|
||||
Below are a collection of commands to find useful performance metrics about your InfluxDB instance.
|
||||
|
||||
To find the number of points per second being written to the instance (the `monitor` service must be enabled):
|
||||
```bash
|
||||
$ influx -execute 'select derivative(pointReq, 1s) from "write" where time > now() - 5m' -database '_internal' -precision 'rfc3339'
|
||||
```
|
||||
|
||||
To find the number of writes separated by database since the beginning of the log file:
|
||||
|
||||
```bash
|
||||
grep 'POST' /var/log/influxdb/influxd.log | awk '{ print $10 }' | sort | uniq -c
|
||||
```
|
||||
|
||||
Or, for systemd systems logging to journald:
|
||||
|
||||
```bash
|
||||
journalctl -u influxdb.service | awk '/POST/ { print $10 }' | sort | uniq -c
|
||||
```
|
||||
|
||||
### InfluxDB `/metrics` HTTP endpoint
|
||||
|
||||
> ***Note:*** There are no outstanding PRs for improvements to the `/metrics` endpoint, but we’ll add them to the CHANGELOG as they occur.
|
||||
|
||||
The InfluxDB `/metrics` endpoint is configured to produce the default Go metrics in Prometheus metrics format.
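
To fetch the metrics, send a plain HTTP GET request to the endpoint. The example below assumes InfluxDB is listening on the default `localhost:8086`:

```bash
curl http://localhost:8086/metrics
```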
|
||||
|
||||
|
||||
#### Example using the InfluxDB `/metrics` endpoint
|
||||
|
||||
Below is an example of the output generated using the `/metrics` endpoint. Note that `HELP` text is included to explain each Go statistic.
|
||||
|
||||
```
|
||||
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
|
||||
# TYPE go_gc_duration_seconds summary
|
||||
go_gc_duration_seconds{quantile="0"} 6.4134e-05
|
||||
go_gc_duration_seconds{quantile="0.25"} 8.8391e-05
|
||||
go_gc_duration_seconds{quantile="0.5"} 0.000131335
|
||||
go_gc_duration_seconds{quantile="0.75"} 0.000169204
|
||||
go_gc_duration_seconds{quantile="1"} 0.000544705
|
||||
go_gc_duration_seconds_sum 0.004619405
|
||||
go_gc_duration_seconds_count 27
|
||||
# HELP go_goroutines Number of goroutines that currently exist.
|
||||
# TYPE go_goroutines gauge
|
||||
go_goroutines 29
|
||||
# HELP go_info Information about the Go environment.
|
||||
# TYPE go_info gauge
|
||||
go_info{version="go1.10"} 1
|
||||
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
|
||||
# TYPE go_memstats_alloc_bytes gauge
|
||||
go_memstats_alloc_bytes 1.581062048e+09
|
||||
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
|
||||
# TYPE go_memstats_alloc_bytes_total counter
|
||||
go_memstats_alloc_bytes_total 2.808293616e+09
|
||||
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
|
||||
# TYPE go_memstats_buck_hash_sys_bytes gauge
|
||||
go_memstats_buck_hash_sys_bytes 1.494326e+06
|
||||
# HELP go_memstats_frees_total Total number of frees.
|
||||
# TYPE go_memstats_frees_total counter
|
||||
go_memstats_frees_total 1.1279913e+07
|
||||
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
|
||||
# TYPE go_memstats_gc_cpu_fraction gauge
|
||||
go_memstats_gc_cpu_fraction -0.00014404354379774563
|
||||
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
|
||||
# TYPE go_memstats_gc_sys_bytes gauge
|
||||
go_memstats_gc_sys_bytes 6.0936192e+07
|
||||
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
|
||||
# TYPE go_memstats_heap_alloc_bytes gauge
|
||||
go_memstats_heap_alloc_bytes 1.581062048e+09
|
||||
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
|
||||
# TYPE go_memstats_heap_idle_bytes gauge
|
||||
go_memstats_heap_idle_bytes 3.8551552e+07
|
||||
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
|
||||
# TYPE go_memstats_heap_inuse_bytes gauge
|
||||
go_memstats_heap_inuse_bytes 1.590673408e+09
|
||||
# HELP go_memstats_heap_objects Number of allocated objects.
|
||||
# TYPE go_memstats_heap_objects gauge
|
||||
go_memstats_heap_objects 1.6924595e+07
|
||||
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
|
||||
# TYPE go_memstats_heap_released_bytes gauge
|
||||
go_memstats_heap_released_bytes 0
|
||||
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
|
||||
# TYPE go_memstats_heap_sys_bytes gauge
|
||||
go_memstats_heap_sys_bytes 1.62922496e+09
|
||||
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
|
||||
# TYPE go_memstats_last_gc_time_seconds gauge
|
||||
go_memstats_last_gc_time_seconds 1.520291233297057e+09
|
||||
# HELP go_memstats_lookups_total Total number of pointer lookups.
|
||||
# TYPE go_memstats_lookups_total counter
|
||||
go_memstats_lookups_total 397
|
||||
# HELP go_memstats_mallocs_total Total number of mallocs.
|
||||
# TYPE go_memstats_mallocs_total counter
|
||||
go_memstats_mallocs_total 2.8204508e+07
|
||||
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
|
||||
# TYPE go_memstats_mcache_inuse_bytes gauge
|
||||
go_memstats_mcache_inuse_bytes 13888
|
||||
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
|
||||
# TYPE go_memstats_mcache_sys_bytes gauge
|
||||
go_memstats_mcache_sys_bytes 16384
|
||||
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
|
||||
# TYPE go_memstats_mspan_inuse_bytes gauge
|
||||
go_memstats_mspan_inuse_bytes 1.4781696e+07
|
||||
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
|
||||
# TYPE go_memstats_mspan_sys_bytes gauge
|
||||
go_memstats_mspan_sys_bytes 1.4893056e+07
|
||||
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
|
||||
# TYPE go_memstats_next_gc_bytes gauge
|
||||
go_memstats_next_gc_bytes 2.38107752e+09
|
||||
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
|
||||
# TYPE go_memstats_other_sys_bytes gauge
|
||||
go_memstats_other_sys_bytes 4.366786e+06
|
||||
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
|
||||
# TYPE go_memstats_stack_inuse_bytes gauge
|
||||
go_memstats_stack_inuse_bytes 983040
|
||||
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
|
||||
# TYPE go_memstats_stack_sys_bytes gauge
|
||||
go_memstats_stack_sys_bytes 983040
|
||||
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
|
||||
# TYPE go_memstats_sys_bytes gauge
|
||||
go_memstats_sys_bytes 1.711914744e+09
|
||||
# HELP go_threads Number of OS threads created.
|
||||
# TYPE go_threads gauge
|
||||
go_threads 16
|
||||
```
|
|
@ -0,0 +1,29 @@
|
|||
---
|
||||
title: Stability and compatibility
|
||||
description: >
|
||||
API and storage engine compatibility and stability in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 90
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
## 1.x API compatibility and stability
|
||||
|
||||
One of the more important aspects of the 1.0 release is that this marks the stabilization of our API and storage format. Over the course of the last three years we’ve iterated aggressively, often breaking the API in the process. With the release of 1.0 and for the entire 1.x line of releases we’re committing to the following:
|
||||
|
||||
### No breaking InfluxDB API changes
|
||||
|
||||
When it comes to the InfluxDB API, if a command works in 1.0 it will work unchanged in all 1.x releases...with one caveat. We will be adding [keywords](/enterprise_influxdb/v1.9/query_language/spec/#keywords) to the query language. New keywords won't break your queries if you wrap all [identifiers](/enterprise_influxdb/v1.9/concepts/glossary/#identifier) in double quotes and all string literals in single quotes. This is generally considered best practice so it should be followed anyway. For users following that guideline, the query and ingestion APIs will have no breaking changes for all 1.x releases. Note that this does not include the Go code in the project. The underlying Go API in InfluxDB can and will change over the course of 1.x development. Users should be accessing InfluxDB through the [InfluxDB API](/enterprise_influxdb/v1.9/tools/api/).
|
||||
|
||||
### Storage engine stability
|
||||
|
||||
The [TSM](/enterprise_influxdb/v1.9/concepts/glossary/#tsm-time-structured-merge-tree) storage engine file format is now at version 1. While we may introduce new versions of the format in the 1.x releases, these new versions will run side-by-side with previous versions. What this means for users is there will be no lengthy migrations when upgrading from one 1.x release to another.
|
||||
|
||||
### Additive changes
|
||||
|
||||
The query engine will have additive changes over the course of the new releases. We’ll introduce new query functions and new functionality into the language without breaking backwards compatibility. We may introduce new protocol endpoints (like a binary format) and versions of the line protocol and query API to improve performance and/or functionality, but they will have to run in parallel with the existing versions. Existing versions will be supported for the entirety of the 1.x release line.
|
||||
|
||||
### Ongoing support
|
||||
|
||||
We’ll continue to fix bugs on the 1.x versions of the [line protocol](/enterprise_influxdb/v1.9/concepts/glossary/#influxdb-line-protocol), query API, and TSM storage format. Users should expect to upgrade to the latest 1.x.x release for bug fixes, but those releases will all be compatible with the 1.0 API and won’t require data migrations. For instance, if a user is running 1.2 and there are bug fixes released in 1.3, they should upgrade to the 1.3 release. Until 1.4 is released, patch fixes will go into 1.3.x. Because all future 1.x releases are drop-in replacements for previous 1.x releases, users should upgrade to the latest in the 1.x line to get all bug fixes.
|
|
@ -0,0 +1,208 @@
|
|||
---
|
||||
title: Manage subscriptions in InfluxDB
|
||||
description: >
|
||||
Manage subscriptions, which copy all written data to a local or remote endpoint, in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Administration
|
||||
name: Manage subscriptions
|
||||
weight: 100
|
||||
---
|
||||
|
||||
InfluxDB subscriptions are local or remote endpoints to which all data written to InfluxDB is copied.
|
||||
Subscriptions are primarily used with [Kapacitor](/kapacitor/), but any endpoint
|
||||
able to accept UDP, HTTP, or HTTPS connections can subscribe to InfluxDB and receive
|
||||
a copy of all data as it is written.
|
||||
|
||||
## How subscriptions work
|
||||
|
||||
As data is written to InfluxDB, writes are duplicated to subscriber endpoints via
|
||||
HTTP, HTTPS, or UDP in [line protocol](/enterprise_influxdb/v1.9/write_protocols/line_protocol_tutorial/).
|
||||
The InfluxDB subscriber service creates multiple "writers" ([goroutines](https://golangbot.com/goroutines/))
|
||||
which send writes to the subscription endpoints.
|
||||
|
||||
_The number of writer goroutines is defined by the [`write-concurrency`](/enterprise_influxdb/v1.9/administration/config#write-concurrency-40) configuration._
|
||||
|
||||
As writes occur in InfluxDB, each subscription writer sends the written data to the
|
||||
specified subscription endpoints.
|
||||
However, with a high `write-concurrency` (multiple writers) and a high ingest rate,
|
||||
nanosecond differences in writer processes and the transport layer can result
|
||||
in writes being received out of order.
|
||||
|
||||
> #### Important information about high write loads
|
||||
> While setting the subscriber `write-concurrency` to greater than 1 does increase your
|
||||
> subscriber write throughput, it can result in out-of-order writes under high ingest rates.
|
||||
> Setting `write-concurrency` to 1 ensures writes are passed to subscriber endpoints sequentially,
|
||||
> but can create a bottleneck under high ingest rates.
|
||||
>
|
||||
> What `write-concurrency` should be set to depends on your specific workload
|
||||
> and need for in-order writes to your subscription endpoint.
|
||||
|
||||
## InfluxQL subscription statements
|
||||
|
||||
Use the following InfluxQL statements to manage subscriptions:
|
||||
|
||||
[`CREATE SUBSCRIPTION`](#create-subscriptions)
|
||||
[`SHOW SUBSCRIPTIONS`](#show-subscriptions)
|
||||
[`DROP SUBSCRIPTION`](#remove-subscriptions)
|
||||
|
||||
## Create subscriptions
|
||||
|
||||
Create subscriptions using the `CREATE SUBSCRIPTION` InfluxQL statement.
|
||||
Specify the subscription name, the database name and retention policy to subscribe to,
|
||||
and the URL of the host to which data written to InfluxDB should be copied.
|
||||
|
||||
```sql
|
||||
-- Pattern:
|
||||
CREATE SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>" DESTINATIONS <ALL|ANY> "<subscription_endpoint_host>"
|
||||
|
||||
-- Examples:
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to 'example.com:9090' via HTTP.
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://example.com:9090'
|
||||
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that round-robins the data to 'h1.example.com:9090' and 'h2.example.com:9090' via UDP.
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ANY 'udp://h1.example.com:9090', 'udp://h2.example.com:9090'
|
||||
```
|
||||
If authentication is enabled on the subscriber host, include the credentials in the URL.
|
||||
|
||||
```sql
|
||||
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to another InfluxDB on 'example.com:8086' via HTTP. Authentication is enabled on the subscription host (user: subscriber, pass: secret).
|
||||
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://subscriber:secret@example.com:8086'
|
||||
```
|
||||
|
||||
{{% warn %}}
|
||||
`SHOW SUBSCRIPTIONS` outputs all subscription URLs in plain text, including those with authentication credentials.
|
||||
Any user with the privileges to run `SHOW SUBSCRIPTIONS` is able to see these credentials.
|
||||
{{% /warn %}}
|
||||
|
||||
### Sending subscription data to multiple hosts
|
||||
|
||||
The `CREATE SUBSCRIPTION` statement allows you to specify multiple hosts as endpoints for the subscription.
|
||||
In your `DESTINATIONS` clause, you can pass multiple host strings separated by commas.
|
||||
Using `ALL` or `ANY` in the `DESTINATIONS` clause determines how InfluxDB writes data to each endpoint:
|
||||
|
||||
`ALL`: Writes data to all specified hosts.
|
||||
|
||||
`ANY`: Round-robins writes between specified hosts.
|
||||
|
||||
_**Subscriptions with multiple hosts**_
|
||||
|
||||
```sql
|
||||
-- Write all data to multiple hosts
|
||||
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ALL 'http://host1.example.com:9090', 'http://host2.example.com:9090'
|
||||
|
||||
-- Round-robin writes between multiple hosts
|
||||
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ANY 'http://host1.example.com:9090', 'http://host2.example.com:9090'
|
||||
```
|
||||
|
||||
### Subscription protocols
|
||||
|
||||
Subscriptions can use HTTP, HTTPS, or UDP transport protocols.
|
||||
Which to use is determined by the protocol expected by the subscription endpoint.
|
||||
If creating a Kapacitor subscription, this is defined by the `subscription-protocol`
|
||||
option in the `[[influxdb]]` section of your [`kapacitor.conf`](/{{< latest "kapacitor" >}}/administration/subscription-management/#subscription-protocol).
|
||||
|
||||
_**kapacitor.conf**_
|
||||
|
||||
```toml
|
||||
[[influxdb]]
|
||||
|
||||
# ...
|
||||
|
||||
subscription-protocol = "http"
|
||||
|
||||
# ...
|
||||
|
||||
```
|
||||
|
||||
_For information regarding HTTPS connections and secure communication between InfluxDB and Kapacitor,
|
||||
view the [Kapacitor security](/kapacitor/v1.5/administration/security/#secure-influxdb-and-kapacitor) documentation._
|
||||
|
||||
## Show subscriptions
|
||||
|
||||
The `SHOW SUBSCRIPTIONS` InfluxQL statement returns a list of all subscriptions registered in InfluxDB.
|
||||
|
||||
```sql
|
||||
SHOW SUBSCRIPTIONS
|
||||
```
|
||||
|
||||
_**Example output:**_
|
||||
|
||||
```bash
|
||||
name: _internal
|
||||
retention_policy name mode destinations
|
||||
---------------- ---- ---- ------------
|
||||
monitor kapacitor-39545771-7b64-4692-ab8f-1796c07f3314 ANY [http://localhost:9092]
|
||||
```
|
||||
|
||||
## Remove subscriptions
|
||||
|
||||
Remove or drop subscriptions using the `DROP SUBSCRIPTION` InfluxQL statement.
|
||||
|
||||
```sql
|
||||
-- Pattern:
|
||||
DROP SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>"
|
||||
|
||||
-- Example:
|
||||
DROP SUBSCRIPTION "sub0" ON "mydb"."autogen"
|
||||
```
|
||||
|
||||
### Drop all subscriptions
|
||||
|
||||
In some cases, it may be necessary to remove all subscriptions.
|
||||
Run the following bash script that utilizes the `influx` CLI, loops through all subscriptions, and removes them.
|
||||
This script depends on the `$INFLUXUSER` and `$INFLUXPASS` environment variables.
|
||||
If these are not set, export them as part of the script.
|
||||
|
||||
```bash
|
||||
# Environment variable exports:
|
||||
# Uncomment these if INFLUXUSER and INFLUXPASS are not already globally set.
|
||||
# export INFLUXUSER=influxdb-username
|
||||
# export INFLUXPASS=influxdb-password
|
||||
|
||||
IFS=$'\n'; for i in $(influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal -execute 'show subscriptions' | tail -n +2 | grep -v name); do influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal -execute "drop subscription \"$(echo "$i" | cut -f 3 -d ',')\" ON \"$(echo "$i" | cut -f 1 -d ',')\".\"$(echo "$i" | cut -f 2 -d ',')\""; done
|
||||
```
|
||||
|
||||
## Configure InfluxDB subscriptions
|
||||
|
||||
InfluxDB subscription configuration options are available in the `[subscriber]`
|
||||
section of the `influxdb.conf`.
|
||||
In order to use subscriptions, the `enabled` option in the `[subscriber]` section must be set to `true`.
|
||||
Below is an example `influxdb.conf` subscriber configuration:
|
||||
|
||||
```toml
|
||||
[subscriber]
|
||||
enabled = true
|
||||
http-timeout = "30s"
|
||||
insecure-skip-verify = false
|
||||
ca-certs = ""
|
||||
write-concurrency = 40
|
||||
write-buffer-size = 1000
|
||||
```
|
||||
|
||||
_**Descriptions of `[subscriber]` configuration options are available in the [Configuring InfluxDB](/enterprise_influxdb/v1.9/administration/config#subscription-settings) documentation.**_
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Inaccessible or decommissioned subscription endpoints
|
||||
|
||||
Unless a subscription is [dropped](#remove-subscriptions), InfluxDB assumes the endpoint
|
||||
should always receive data and will continue to attempt to send data.
|
||||
If an endpoint host is inaccessible or has been decommissioned, you will see errors
|
||||
similar to the following:
|
||||
|
||||
```bash
|
||||
# Some message content omitted (...) for the sake of brevity
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" ... service=subscriber
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp x.y.z.a:9092: getsockopt: connection refused" ... service=subscriber
|
||||
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp 172.31.36.5:9092: getsockopt: no route to host" ... service=subscriber
|
||||
```
|
||||
|
||||
In some cases, this may be caused by a networking error or something similar
|
||||
preventing a successful connection to the subscription endpoint.
|
||||
In other cases, it's because the subscription endpoint no longer exists and
|
||||
the subscription hasn't been dropped from InfluxDB.
|
||||
|
||||
> Because InfluxDB does not know if a subscription endpoint will or will not become accessible again,
|
||||
> subscriptions are not automatically dropped when an endpoint becomes inaccessible.
|
||||
> If a subscription endpoint is removed, you must manually [drop the subscription](#remove-subscriptions) from InfluxDB.
|
|
@ -0,0 +1,245 @@
|
|||
---
|
||||
title: Upgrade InfluxDB Enterprise clusters
|
||||
description: Upgrade to the latest version of InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/administration/upgrading/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Upgrade
|
||||
weight: 50
|
||||
parent: Administration
|
||||
---
|
||||
|
||||
To successfully perform a rolling upgrade of InfluxDB Enterprise clusters to {{< latest-patch >}}, complete the following steps:
|
||||
|
||||
1. [Back up your cluster](#back-up-your-cluster).
|
||||
2. [Upgrade meta nodes](#upgrade-meta-nodes).
|
||||
3. [Upgrade data nodes](#upgrade-data-nodes).
|
||||
|
||||
> ***Note:*** A rolling upgrade lets you update your cluster with zero downtime. To downgrade to an earlier version, complete the following procedures, replacing the version numbers with the version that you want to downgrade to.
|
||||
|
||||
## Back up your cluster
|
||||
|
||||
Before performing an upgrade, create a full backup of your InfluxDB Enterprise cluster. Also, if you create incremental backups, trigger a final incremental backup.
|
||||
|
||||
> ***Note:*** For information on performing a final incremental backup or a full backup,
|
||||
> see [Back up and restore InfluxDB Enterprise clusters](/enterprise_influxdb/v1.9/administration/backup-and-restore/).
|
||||
|
||||
## Upgrade meta nodes
|
||||
|
||||
Complete the following steps to upgrade meta nodes:
|
||||
|
||||
1. [Download the meta node package](#download-the-meta-node-package).
|
||||
2. [Install the meta node package](#install-the-meta-node-package).
|
||||
3. [Update the meta node configuration file](#update-the-meta-node-configuration-file).
|
||||
4. [Restart the `influxdb-meta` service](#restart-the-influxdb-meta-service).
|
||||
5. Repeat steps 1-4 for each meta node in your cluster.
|
||||
6. [Confirm the meta nodes upgrade](#confirm-the-meta-nodes-upgrade).
|
||||
|
||||
### Download the meta node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Install the meta node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
sudo dpkg -i influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
sudo yum localinstall influxdb-meta-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Update the meta node configuration file
|
||||
|
||||
Migrate any custom settings from your previous meta node configuration file.
|
||||
|
||||
To enable HTTPS, you must update the meta node configuration file (`influxdb-meta.conf`). For information, see [Enable HTTPS within the configuration file for each Meta Node](/enterprise_influxdb/v1.9/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Restart the `influxdb-meta` service
|
||||
|
||||
##### sysvinit systems
|
||||
|
||||
```bash
|
||||
service influxdb-meta restart
|
||||
```
|
||||
|
||||
##### systemd systems
|
||||
|
||||
```bash
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
|
||||
### Confirm the meta nodes upgrade
|
||||
|
||||
After upgrading _**all**_ meta nodes, check your node version numbers using the
|
||||
`influxd-ctl show` command.
|
||||
The [`influxd-ctl` utility](/enterprise_influxdb/v1.9/tools/influxd-ctl/) is available on all meta nodes.
|
||||
|
||||
```bash
|
||||
~# influxd-ctl show
|
||||
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 rk-upgrading-01:8088 1.8.x_c1.8.y
|
||||
5 rk-upgrading-02:8088 1.8.x_c1.8.y
|
||||
6 rk-upgrading-03:8088 1.8.x_c1.8.y
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
|
||||
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
Ensure that the meta cluster is healthy before upgrading the data nodes.
|
||||
|
||||
## Upgrade data nodes
|
||||
|
||||
Complete the following steps to upgrade data nodes:
|
||||
|
||||
1. [Download the data node package](#download-the-data-node-package).
|
||||
2. [Stop traffic to data nodes](#stop-traffic-to-the-data-node).
|
||||
3. [Install the data node package](#install-the-data-node-package).
|
||||
4. [Update the data node configuration file](#update-the-data-node-configuration-file).
|
||||
5. For Time Series Index (TSI) only: [Rebuild TSI indexes](#rebuild-tsi-indexes).
|
||||
6. [Restart the `influxdb` service](#restart-the-influxdb-service).
|
||||
7. [Restart traffic to data nodes](#restart-traffic-to-data-nodes).
|
||||
8. Repeat steps 1-7 for each data node in your cluster.
|
||||
9. [Confirm the data nodes upgrade](#confirm-the-data-nodes-upgrade).
|
||||
|
||||
### Download the data node package
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Stop traffic to the data node
|
||||
|
||||
- If you have access to the load balancer configuration, use your load balancer to stop routing read and write requests to the data node server (port 8086).
|
||||
|
||||
- If you cannot access the load balancer configuration, work with your networking team to prevent traffic to the data node server before continuing to upgrade.
|
||||
|
||||
### Install the data node package
|
||||
|
||||
When you run the install command, you're prompted to keep or overwrite your current configuration file with the file for version {{< latest-patch >}}. Enter `N` or `O` to keep your current configuration file. You'll make the configuration changes for version {{< latest-patch >}} in the next procedure, [Update the data node configuration file](#update-the-data-node-configuration-file).
|
||||
|
||||
|
||||
##### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat & CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
sudo yum localinstall influxdb-data-{{< latest-patch >}}-c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
### Update the data node configuration file
|
||||
|
||||
Migrate any custom settings from your previous data node configuration file.
|
||||
|
||||
- To enable HTTPS, see [Enable HTTPS within the configuration file for each Data Node](/enterprise_influxdb/v1.9/guides/https_setup/#set-up-https-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
- To enable TSI, open `/etc/influxdb/influxdb.conf`, and then adjust and save the settings shown in the following table.
|
||||
|
||||
| Section | Setting |
|
||||
| --------| ----------------------------------------------------------|
|
||||
| `[data]` | <ul><li>To use Time Series Index (TSI) disk-based indexing, add [`index-version = "tsi1"`](/enterprise_influxdb/v1.9/administration/config-data-nodes#index-version-inmem) <li>To use the TSM in-memory index, add [`index-version = "inmem"`](/enterprise_influxdb/v1.9/administration/config-data-nodes#index-version-inmem) <li>Add [`wal-fsync-delay = "0s"`](/enterprise_influxdb/v1.9/administration/config-data-nodes#wal-fsync-delay-0s) <li>Add [`max-concurrent-compactions = 0`](/enterprise_influxdb/v1.9/administration/config-data-nodes#max-concurrent-compactions-0) <li>Set [`cache-max-memory-size`](/enterprise_influxdb/v1.9/administration/config-data-nodes#cache-max-memory-size-1g) to `1073741824`</ul> |
|
||||
| `[cluster]` | <ul><li>Add [`pool-max-idle-streams = 100`](/enterprise_influxdb/v1.9/administration/config-data-nodes#pool-max-idle-streams-100) <li>Add [`pool-max-idle-time = "1m0s"`](/enterprise_influxdb/v1.9/administration/config-data-nodes#pool-max-idle-time-60s) <li>Remove `max-remote-write-connections`</ul> |
|
||||
|[`[anti-entropy]`](/enterprise_influxdb/v1.9/administration/config-data-nodes#anti-entropy)| <ul><li>Add `enabled = true` <li>Add `check-interval = "30s"` <li>Add `max-fetch = 10`|
|
||||
|`[admin]`| Remove entire section.|
|
||||
|
||||
For more information about TSI, see [TSI overview](/enterprise_influxdb/v1.9/concepts/time-series-index/) and [TSI details](/enterprise_influxdb/v1.9/concepts/tsi-details/).
|
||||
|
||||
### Rebuild TSI indexes
|
||||
|
||||
Complete the following steps for Time Series Index (TSI) only.
|
||||
|
||||
1. Delete all `_series` directories in the `/data` directory (by default, stored at `/data/<dbName>/_series`).
|
||||
|
||||
2. Delete all TSM-based shard `index` directories (by default, located at `/data/<dbName>/<rpName>/<shardID>/index`).
|
||||
|
||||
3. Use the [`influx_inspect buildtsi`](/enterprise_influxdb/v1.9/tools/influx_inspect#buildtsi) utility to rebuild the TSI index. For example, run the following command:
|
||||
|
||||
```bash
|
||||
influx_inspect buildtsi -datadir /yourDataDirectory -waldir /wal
|
||||
```
|
||||
|
||||
Replace `yourDataDirectory` with the name of your data directory. Running this command converts TSM-based shards to TSI shards or rebuilds existing TSI shards.
|
||||
|
||||
> **Note:** Run the `buildtsi` command using the same system user that runs the `influxd` service, or a user with the same permissions.
|
||||
|
||||
### Restart the `influxdb` service
|
||||
|
||||
Restart the `influxdb` service to restart the data nodes.
|
||||
|
||||
##### sysvinit systems
|
||||
|
||||
```bash
|
||||
service influxdb restart
|
||||
```
|
||||
|
||||
##### systemd systems
|
||||
|
||||
```bash
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
|
||||
### Restart traffic to data nodes
|
||||
|
||||
Restart routing read and write requests to the data node server (port 8086) through your load balancer.
|
||||
|
||||
> **Note:** Allow the hinted handoff queue (HHQ) to write all missed data to the updated node before upgrading the next data node. Once all data has been written, the disk space used in the hinted handoff queue should be 0. Check the disk space on your hh directory by running the `du` command, for example, `du /var/lib/influxdb/hh`.
|
||||
|
||||
### Confirm the data nodes upgrade
|
||||
|
||||
After upgrading _**all**_ data nodes, check your node version numbers using the
|
||||
`influxd-ctl show` command.
|
||||
The [`influxd-ctl` utility](/enterprise_influxdb/v1.9/tools/influxd-ctl/) is available on all meta nodes.
|
||||
|
||||
```bash
|
||||
~# influxd-ctl show
|
||||
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 rk-upgrading-01:8088 {{< latest-patch >}}-c{{< latest-patch >}} # {{< latest-patch >}}-c{{< latest-patch >}} = 👍
|
||||
5 rk-upgrading-02:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
6 rk-upgrading-03:8088 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
rk-upgrading-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
rk-upgrading-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
If you have any issues upgrading your cluster, contact InfluxData support.
|
|
@ -0,0 +1,12 @@
|
|||
---
|
||||
title: InfluxDB Enterprise concepts
|
||||
description: Clustering and other key concepts in InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.9/concepts/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Concepts
|
||||
weight: 50
|
||||
---
|
||||
|
||||
{{< children hlevel="h2" type="list" >}}
|
|
@ -0,0 +1,138 @@
|
|||
---
|
||||
title: Clustering in InfluxDB Enterprise
|
||||
description: >
|
||||
Learn how meta nodes, data nodes, and the Enterprise web server interact in InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.9/concepts/clustering/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Clustering
|
||||
weight: 10
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
This document describes in detail how clustering works in InfluxDB Enterprise. It starts with a high level description of the different components of a cluster and then delves into the implementation details.
|
||||
|
||||
## Architectural overview
|
||||
|
||||
An InfluxDB Enterprise installation consists of three separate software processes: data nodes, meta nodes, and the Enterprise web server. To run an InfluxDB cluster, only the meta and data nodes are required. Communication within a cluster looks like this:
|
||||
|
||||
{{< diagram >}}
|
||||
flowchart TB
|
||||
subgraph meta[Meta Nodes]
|
||||
Meta1 <-- TCP :8089 --> Meta2 <-- TCP :8089 --> Meta3
|
||||
end
|
||||
meta <-- HTTP :8091 --> data
|
||||
subgraph data[Data Nodes]
|
||||
Data1 <-- TCP :8088 --> Data2
|
||||
end
|
||||
{{< /diagram >}}
|
||||
|
||||
|
||||
The meta nodes communicate with each other via a TCP protocol and the Raft consensus protocol, both of which use port `8089` by default. This port must be reachable between the meta nodes. The meta nodes also expose an HTTP API, bound to port `8091` by default, that the `influxd-ctl` command uses.
|
||||
|
||||
Data nodes communicate with each other through a TCP protocol that is bound to port `8088`. Data nodes communicate with the meta nodes through their HTTP API bound to `8091`. These ports must be reachable between the meta and data nodes.
|
||||
|
||||
Within a cluster, all meta nodes must communicate with all other meta nodes. All data nodes must communicate with all other data nodes and all meta nodes.
|
||||
|
||||
The meta nodes keep a consistent view of the metadata that describes the cluster. The meta cluster uses the [HashiCorp implementation of Raft](https://github.com/hashicorp/raft) as the underlying consensus protocol. This is the same implementation used in Consul.
|
||||
|
||||
The data nodes replicate data and query each other using the Protobuf protocol over TCP. Details on replication and querying are covered later in this document.
|
||||
|
||||
## Where data lives
|
||||
|
||||
The meta and data nodes are each responsible for different parts of the database.
|
||||
|
||||
### Meta nodes
|
||||
|
||||
Meta nodes hold all of the following meta data:
|
||||
|
||||
* all nodes in the cluster and their role
|
||||
* all databases and retention policies that exist in the cluster
|
||||
* all shards and shard groups, and on what nodes they exist
|
||||
* cluster users and their permissions
|
||||
* all continuous queries
|
||||
|
||||
The meta nodes keep this data in the Raft database on disk, backed by BoltDB. By default the Raft database is `/var/lib/influxdb/meta/raft.db`.
|
||||
|
||||
> **Note:** Meta nodes require the `/meta` directory.
|
||||
|
||||
### Data nodes
|
||||
|
||||
Data nodes hold all of the raw time series data and metadata, including:
|
||||
|
||||
* measurements
|
||||
* tag keys and values
|
||||
* field keys and values
|
||||
|
||||
On disk, the data is always organized by `<database>/<retention_policy>/<shard_id>`. By default the parent directory is `/var/lib/influxdb/data`.
|
||||
|
||||
> **Note:** Data nodes require all four subdirectories of `/var/lib/influxdb/`, including `/meta` (specifically, the `clients.json` file), `/data`, `/wal`, and `/hh`.
|
||||
|
||||
## Optimal server counts
|
||||
|
||||
When creating a cluster, you need to decide how many meta and data nodes to configure and connect. You can think of InfluxDB Enterprise as two separate clusters that communicate with each other: a cluster of meta nodes and one of data nodes. The number of meta nodes is driven by the number of meta node failures the cluster needs to be able to handle, while the number of data nodes scales based on your storage and query needs.
|
||||
|
||||
The Raft consensus protocol requires a quorum to perform any operation, so there should always be an odd number of meta nodes. For almost all applications, 3 meta nodes is what you want. It gives you an odd number of meta nodes so that a quorum can be reached. And, if one meta node is lost, the cluster can still operate with the remaining 2 meta nodes until the third one is replaced. Additional meta nodes exponentially increase communication overhead and are not recommended unless you expect the cluster to frequently lose meta nodes.
|
||||
|
||||
Data nodes hold the actual time series data. The minimum number of data nodes to run is 1 and can scale up from there. **Generally, you'll want to run a number of data nodes that is evenly divisible by your replication factor.** For instance, if you have a replication factor of 2, you'll want to run 2, 4, 6, 8, 10, etc. data nodes.
|
||||
|
||||
## Chronograf
|
||||
|
||||
[Chronograf](/{{< latest "chronograf" >}}/introduction/getting-started/) is the user interface component of InfluxData’s TICK stack.
|
||||
It makes owning the monitoring and alerting for your infrastructure easy to set up and maintain.
|
||||
It talks directly to the data and meta nodes over their HTTP protocols, which are bound by default to ports `8086` for data nodes and port `8091` for meta nodes.
|
||||
|
||||
## Writes in a cluster
|
||||
|
||||
This section describes how writes in a cluster work. We'll work through some examples using a cluster of four data nodes: `A`, `B`, `C`, and `D`. Assume that we have a retention policy with a replication factor of 2 with shard durations of 1 day.
|
||||
|
||||
### Shard groups
|
||||
|
||||
The cluster creates shards within a shard group to maximize the number of data nodes utilized. If there are N data nodes in the cluster and the replication factor is X, then N/X shards are created in each shard group, discarding any fractions.
|
||||
|
||||
This means that a new shard group gets created for each day of data that gets written in. Within each shard group 2 shards are created. Because of the replication factor of 2, each of those two shards is copied to 2 servers. For example, we have a shard group for `2016-09-19` that has two shards `1` and `2`. Shard `1` is replicated to servers `A` and `B` while shard `2` is copied to servers `C` and `D`.
|
||||
|
||||
When a write comes in with values that have a timestamp in `2016-09-19` the cluster must first determine which shard within the shard group should receive the write. This is done by taking a hash of the `measurement` + sorted `tagset` (the metaseries) and bucketing into the correct shard. In Go this looks like:
|
||||
|
||||
```go
|
||||
// key is the measurement + sorted tagset (the metaseries)
// shardGroup is the group for the values based on timestamp
// hash the key with FNV-1a, then bucket into one of the group's shards
h := fnv.New64a()
h.Write([]byte(key))
shard := shardGroup.shards[h.Sum64()%uint64(len(shardGroup.shards))]
|
||||
```
|
||||
|
||||
There are multiple implications to this scheme for determining where data lives in a cluster. First, for any given metaseries all data on any given day exists in a single shard, and thus only on those servers hosting a copy of that shard. Second, once a shard group is created, adding new servers to the cluster won't scale out write capacity for that shard group. The replication is fixed when the shard group is created.
|
||||
|
||||
However, there is a method for expanding writes in the current shard group (i.e. today) when growing a cluster. The current shard group can be truncated to stop at the current time using `influxd-ctl truncate-shards`. This immediately closes the current shard group, forcing a new shard group to be created. That new shard group inherits the latest retention policy and data node changes and then copies itself appropriately to the newly available data nodes. Run `influxd-ctl truncate-shards help` for more information on the command.
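
For example, from any meta node (a minimal sketch; run `influxd-ctl truncate-shards help` for the available flags):

```bash
# Close out the current shard group so the next shard group is created
# across the newly added data nodes
influxd-ctl truncate-shards
```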
|
||||
|
||||
### Write consistency
|
||||
|
||||
Each request to the HTTP API can specify the consistency level via the `consistency` query parameter. For this example let's assume that an HTTP write is being sent to server `D` and the data belongs in shard `1`. The write needs to be replicated to the owners of shard `1`: data nodes `A` and `B`. When the write comes into `D`, that node determines from its local cache of the metastore that the write needs to be replicated to the `A` and `B`, and it immediately tries to write to both. The subsequent behavior depends on the consistency level chosen:
|
||||
|
||||
* `any` - return success to the client as soon as any node has responded with a write success, or the receiving node has written the data to its hinted handoff queue. In our example, if `A` or `B` return a successful write response to `D`, or if `D` has cached the write in its local hinted handoff, `D` returns a write success to the client.
|
||||
* `one` - return success to the client as soon as any node has responded with a write success, but not if the write is only in hinted handoff. In our example, if `A` or `B` return a successful write response to `D`, `D` returns a write success to the client. If `D` could not send the data to either `A` or `B` but instead put the data in hinted handoff, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
|
||||
* `quorum` - return success when a majority of nodes return success. This option is only useful if the replication factor is greater than 2, otherwise it is equivalent to `all`. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then a majority of nodes have not successfully persisted the write and `D` returns a write failure to the client. If we assume for a moment the data were bound for three nodes, `A`, `B`, and `C`, then if any two of those nodes respond with a write success, `D` returns a write success to the client. If one or fewer nodes respond with a success, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
|
||||
* `all` - return success only when all nodes return success. In our example, if both `A` and `B` return a successful write response to `D`, `D` returns a write success to the client. If either `A` or `B` does not return success, then `D` returns a write failure to the client. If we again assume three destination nodes `A`, `B`, and `C`, then if all three nodes respond with a write success, `D` returns a write success to the client. Otherwise, `D` returns a write failure to the client. Note that this means writes may return a failure and yet the data may eventually persist successfully when hinted handoff drains.
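
For example, a client can request `quorum` consistency through the `consistency` query parameter on the HTTP write endpoint. The host, database, and point below are hypothetical:

```bash
curl -XPOST "http://influxdb.example.com:8086/write?db=mydb&consistency=quorum" \
  --data-binary 'cpu,host=server01 value=0.64'
```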
|
||||
|
||||
The important thing to note is how failures are handled. In the case of failures, the database uses the hinted handoff system.
|
||||
|
||||
### Hinted handoff
|
||||
|
||||
Hinted handoff is how InfluxDB Enterprise deals with data node outages while writes are happening. Hinted handoff is essentially a durable disk based queue. When writing at `any`, `one` or `quorum` consistency, hinted handoff is used when one or more replicas return an error after a success has already been returned to the client. When writing at `all` consistency, writes cannot return success unless all nodes return success. Temporarily stalled or failed writes may still go to the hinted handoff queues but the cluster would have already returned a failure response to the write. The receiving node creates a separate queue on disk for each data node (and shard) it cannot reach.
|
||||
|
||||
Let's again use the example of a write coming to `D` that should go to shard `1` on `A` and `B`. If we specified a consistency level of `one` and node `A` returns success, `D` immediately returns success to the client even though the write to `B` is still in progress.
|
||||
|
||||
Now let's assume that `B` returns an error. Node `D` then puts the write into its hinted handoff queue for shard `1` on node `B`. In the background, node `D` continues to attempt to empty the hinted handoff queue by writing the data to node `B`. The configuration file has settings for the maximum size and age of data in hinted handoff queues.
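
A minimal sketch of those settings in the data node configuration file, assuming the documented defaults for the `[hinted-handoff]` section:

```toml
[hinted-handoff]
  enabled = true
  dir = "/var/lib/influxdb/hh"
  # Maximum queue size in bytes per destination node before new writes are rejected
  max-size = 10737418240
  # Maximum age of a queued write before it is purged
  max-age = "168h0m0s"
```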
|
||||
|
||||
If a data node is restarted, it checks for pending writes in the hinted handoff queues and resumes attempts to replicate the writes. The important thing to note is that the hinted handoff queue is durable and does survive a process restart.
|
||||
|
||||
When restarting nodes within an active cluster, during upgrades or maintenance, for example, other nodes in the cluster store hinted handoff writes to the offline node and replicates them when the node is again available. Thus, a healthy cluster should have enough resource headroom on each data node to handle the burst of hinted handoff writes following a node outage. The returning node needs to handle both the steady state traffic and the queued hinted handoff writes from other nodes, meaning its write traffic will have a significant spike following any outage of more than a few seconds, until the hinted handoff queue drains.
|
||||
|
||||
If a node with pending hinted handoff writes for another data node receives a write destined for that node, it adds the write to the end of the hinted handoff queue rather than attempt a direct write. This ensures that data nodes receive data in mostly chronological order, as well as preventing unnecessary connection attempts while the other node is offline.
|
||||
|
||||
## Queries in a cluster
|
||||
|
||||
Queries in a cluster are distributed based on the time range being queried and the replication factor of the data. For example, if the retention policy has a replication factor of 4, the coordinating data node receiving the query randomly picks any of the 4 data nodes that store a replica of the shard(s) to receive the query. If we assume that the system has shard durations of one day, then for each day of time covered by a query, the coordinating node selects one data node to receive the query for that day.
|
||||
|
||||
The coordinating node executes and fulfills the query locally whenever possible. If a query must scan multiple shard groups (multiple days in the example above), the coordinating node forwards queries to other nodes for shards it does not have locally, in parallel with scanning its own local data. The queries are distributed to as many nodes as required to query each shard group once. As the results come back from each data node, the coordinating data node combines them into the final result that gets returned to the user.
|
|
@ -0,0 +1,219 @@
|
|||
---
|
||||
title: Compare InfluxDB to SQL databases
|
||||
description: Differences between InfluxDB and SQL databases.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Compare InfluxDB to SQL databases
|
||||
weight: 30
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
InfluxDB is similar to a SQL database, but different in many ways.
|
||||
InfluxDB is purpose-built for time series data.
|
||||
Relational databases _can_ handle time series data, but are not optimized for common time series workloads.
|
||||
InfluxDB is designed to store large volumes of time series data and quickly perform real-time analysis on that data.
|
||||
|
||||
### Timing is everything
|
||||
|
||||
In InfluxDB, a timestamp identifies a single point in any given data series.
|
||||
This is like an SQL database table where the primary key is pre-set by the system and is always time.
|
||||
|
||||
InfluxDB also recognizes that your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) preferences may change over time.
|
||||
In InfluxDB you don't have to define schemas up front.
|
||||
Data points can have one of the fields on a measurement, all of the fields on a measurement, or any number in-between.
|
||||
You can add new fields to a measurement simply by writing a point for that new field.
|
||||
If you need an explanation of the terms measurements, tags, and fields, check out the next section for an SQL database to InfluxDB terminology crosswalk.
|
||||
|
||||
## Terminology
|
||||
|
||||
The table below is a (very) simple example of a table called `foodships` in an SQL database
|
||||
with the unindexed column `#_foodships` and the indexed columns `park_id`, `planet`, and `time`.
|
||||
|
||||
``` sql
|
||||
+---------+---------+---------------------+--------------+
|
||||
| park_id | planet | time | #_foodships |
|
||||
+---------+---------+---------------------+--------------+
|
||||
| 1 | Earth | 1429185600000000000 | 0 |
|
||||
| 1 | Earth | 1429185601000000000 | 3 |
|
||||
| 1 | Earth | 1429185602000000000 | 15 |
|
||||
| 1 | Earth | 1429185603000000000 | 15 |
|
||||
| 2 | Saturn | 1429185600000000000 | 5 |
|
||||
| 2 | Saturn | 1429185601000000000 | 9 |
|
||||
| 2 | Saturn | 1429185602000000000 | 10 |
|
||||
| 2 | Saturn | 1429185603000000000 | 14 |
|
||||
| 3 | Jupiter | 1429185600000000000 | 20 |
|
||||
| 3 | Jupiter | 1429185601000000000 | 21 |
|
||||
| 3 | Jupiter | 1429185602000000000 | 21 |
|
||||
| 3 | Jupiter | 1429185603000000000 | 20 |
|
||||
| 4 | Saturn | 1429185600000000000 | 5 |
|
||||
| 4 | Saturn | 1429185601000000000 | 5 |
|
||||
| 4 | Saturn | 1429185602000000000 | 6 |
|
||||
| 4 | Saturn | 1429185603000000000 | 5 |
|
||||
+---------+---------+---------------------+--------------+
|
||||
```
|
||||
|
||||
Those same data look like this in InfluxDB:
|
||||
|
||||
```sql
|
||||
name: foodships
|
||||
tags: park_id=1, planet=Earth
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 0
|
||||
2015-04-16T12:00:01Z 3
|
||||
2015-04-16T12:00:02Z 15
|
||||
2015-04-16T12:00:03Z 15
|
||||
|
||||
name: foodships
|
||||
tags: park_id=2, planet=Saturn
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 5
|
||||
2015-04-16T12:00:01Z 9
|
||||
2015-04-16T12:00:02Z 10
|
||||
2015-04-16T12:00:03Z 14
|
||||
|
||||
name: foodships
|
||||
tags: park_id=3, planet=Jupiter
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 20
|
||||
2015-04-16T12:00:01Z 21
|
||||
2015-04-16T12:00:02Z 21
|
||||
2015-04-16T12:00:03Z 20
|
||||
|
||||
name: foodships
|
||||
tags: park_id=4, planet=Saturn
|
||||
time #_foodships
|
||||
---- ------------
|
||||
2015-04-16T12:00:00Z 5
|
||||
2015-04-16T12:00:01Z 5
|
||||
2015-04-16T12:00:02Z 6
|
||||
2015-04-16T12:00:03Z 5
|
||||
```
|
||||
|
||||
Referencing the example above, in general:
|
||||
|
||||
* An InfluxDB measurement (`foodships`) is similar to an SQL database table.
|
||||
* InfluxDB tags (`park_id` and `planet`) are like indexed columns in an SQL database.
|
||||
* InfluxDB fields (`#_foodships`) are like unindexed columns in an SQL database.
|
||||
* InfluxDB points (for example, `2015-04-16T12:00:00Z 5`) are similar to SQL rows.
|
||||
|
||||
Building on this comparison of database terminology,
|
||||
InfluxDB [continuous queries](/enterprise_influxdb/v1.9/concepts/glossary/#continuous-query-cq)
|
||||
and [retention policies](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp) are
|
||||
similar to stored procedures in an SQL database.
|
||||
They're specified once and then performed regularly and automatically.
|
||||
|
||||
Of course, there are some major disparities between SQL databases and InfluxDB.
|
||||
SQL `JOIN`s aren't available for InfluxDB measurements; your schema design should reflect that difference.
|
||||
And, as we mentioned above, a measurement is like an SQL table where the primary index is always pre-set to time.
|
||||
InfluxDB timestamps must be in UNIX epoch (GMT) or formatted as a date-time string valid under RFC3339.
|
||||
|
||||
For more detailed descriptions of the InfluxDB terms mentioned in this section see our [Glossary of Terms](/enterprise_influxdb/v1.9/concepts/glossary/).
|
||||
|
||||
## Query languages
|
||||
InfluxDB supports multiple query languages:
|
||||
|
||||
- [Flux](#flux)
|
||||
- [InfluxQL](#influxql)
|
||||
|
||||
### Flux
|
||||
|
||||
[Flux](/enterprise_influxdb/v1.9/flux/) is a data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
Beginning with **InfluxDB 1.8.0**, Flux is available for production use alongside InfluxQL.
|
||||
|
||||
For those familiar with [InfluxQL](#influxql), Flux is intended to address
|
||||
many of the outstanding feature requests that we've received since introducing InfluxDB 1.0.
|
||||
For a comparison between Flux and InfluxQL, see [Flux vs InfluxQL](/enterprise_influxdb/v1.9/flux/flux-vs-influxql/).
|
||||
|
||||
Flux is the primary language for working with data in [InfluxDB OSS 2.0](/influxdb/v2.0/get-started)
|
||||
and [InfluxDB Cloud](/influxdb/cloud/get-started/),
|
||||
a generally available Platform as a Service (PaaS) available across multiple Cloud Service Providers.
|
||||
Using Flux with InfluxDB 1.8+ lets you get familiar with Flux concepts and syntax
|
||||
and ease the transition to InfluxDB 2.0.
|
||||
|
||||
### InfluxQL
|
||||
|
||||
InfluxQL is an SQL-like query language for interacting with InfluxDB.
|
||||
It has been crafted to feel familiar to those coming from other
|
||||
SQL or SQL-like environments while also providing features specific
|
||||
to storing and analyzing time series data.
|
||||
However, **InfluxQL is not SQL** and lacks support for more advanced operations
|
||||
like `UNION`, `JOIN` and `HAVING` that SQL power-users are accustomed to.
|
||||
This functionality is available with [Flux](/flux/latest/introduction).
|
||||
|
||||
InfluxQL's `SELECT` statement follows the form of an SQL `SELECT` statement:
|
||||
|
||||
```sql
|
||||
SELECT <stuff> FROM <measurement_name> WHERE <some_conditions>
|
||||
```
|
||||
|
||||
where `WHERE` is optional.
|
||||
|
||||
To get the InfluxDB output in the section above, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships"
|
||||
```
|
||||
|
||||
If you only wanted to see data for the planet `Saturn`, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE "planet" = 'Saturn'
|
||||
```
|
||||
|
||||
If you wanted to see data for the planet `Saturn` after 12:00:01 UTC on April 16, 2015, you'd enter:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE "planet" = 'Saturn' AND time > '2015-04-16 12:00:01'
|
||||
```
|
||||
|
||||
As shown in the example above, InfluxQL allows you to specify the time range of your query in the `WHERE` clause.
|
||||
You can use date-time strings wrapped in single quotes that have the
|
||||
format `YYYY-MM-DD HH:MM:SS.mmm`
|
||||
(`mmm` is milliseconds and is optional, and you can also specify microseconds or nanoseconds).
|
||||
You can also use relative time with `now()` which refers to the server's current timestamp:
|
||||
|
||||
```sql
|
||||
SELECT * FROM "foodships" WHERE time > now() - 1h
|
||||
```
|
||||
|
||||
That query outputs the data in the `foodships` measurement where the timestamp is newer than the server's current time minus one hour.
|
||||
The options for specifying time durations with `now()` are:
|
||||
|
||||
|Letter|Meaning|
|
||||
|:---:|:---:|
|
||||
| ns | nanoseconds |
|
||||
|u or µ|microseconds|
|
||||
| ms | milliseconds |
|
||||
|s | seconds |
|
||||
| m | minutes |
|
||||
| h | hours |
|
||||
| d | days |
|
||||
| w | weeks |
|
||||
|
||||
InfluxQL also supports regular expressions, arithmetic in expressions, `SHOW` statements, and `GROUP BY` statements.
|
||||
See our [data exploration](/enterprise_influxdb/v1.9/query_language/explore-data/) page for an in-depth discussion of those topics.
|
||||
InfluxQL functions include `COUNT`, `MIN`, `MAX`, `MEDIAN`, `DERIVATIVE` and more.
|
||||
For a full list check out the [functions](/enterprise_influxdb/v1.9/query_language/functions/) page.
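For example, a quick sketch combining a function with `GROUP BY time()` (the field name `shipments` is hypothetical):

```sql
-- Illustrative only: average of a hypothetical "shipments" field per 30-minute window over the last day
SELECT MEAN("shipments") FROM "foodships" WHERE time > now() - 1d GROUP BY time(30m)
```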
|
||||
|
||||
Now that you have the general idea, check out our [Getting Started Guide](/enterprise_influxdb/v1.9/introduction/getting-started/).
|
||||
|
||||
## InfluxDB is not CRUD
|
||||
|
||||
InfluxDB is a database that has been optimized for time series data.
|
||||
This data commonly comes from sources like distributed sensor groups, click data from large websites, or lists of financial transactions.
|
||||
|
||||
One thing this data has in common is that it is more useful in the aggregate.
|
||||
One reading saying that your computer’s CPU is at 12% utilization at 12:38:35 UTC on a Tuesday is hard to draw conclusions from.
|
||||
It becomes more useful when combined with the rest of the series and visualized.
|
||||
This is where trends over time begin to show, and actionable insight can be drawn from the data.
|
||||
In addition, time series data is generally written once and rarely updated.
|
||||
|
||||
The result is that InfluxDB is not a full CRUD database but more like a CR-ud, prioritizing the performance of creating and reading data over update and destroy, and [preventing some update and destroy behaviors](/enterprise_influxdb/v1.9/concepts/insights_tradeoffs/) to make create and read more performant:
|
||||
|
||||
* To update a point, insert one with [the same measurement, tag set, and timestamp](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points).
|
||||
* You can [drop or delete a series](/enterprise_influxdb/v1.9/query_language/manage-database/#drop-series-from-the-index-with-drop-series), but not individual points based on field values. As a workaround, you can search for the field value, retrieve the time, then [DELETE based on the `time` field](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-series-with-delete) (see the example after this list).
|
||||
* You can't update or rename tags yet - see GitHub issue [#4157](https://github.com/influxdata/influxdb/issues/4157) for more information. To modify the tag of a series of points, find the points with the offending tag value, change the value to the desired one, write the points back, then drop the series with the old tag value.
|
||||
* You can't delete tags by tag key (as opposed to value) - see GitHub issue [#8604](https://github.com/influxdata/influxdb/issues/8604).
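As an example of the delete workaround mentioned above, using the `foodships` data: once you know the timestamps of the points to remove, you can delete by tag and time (field values cannot appear in the `WHERE` clause of a `DELETE`):

```sql
-- Illustrative only: delete points for Saturn in "foodships" before a known time
DELETE FROM "foodships" WHERE "planet" = 'Saturn' AND time < '2015-04-16 12:00:01'
```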
|
|
@ -0,0 +1,457 @@
|
|||
---
|
||||
title: Glossary
|
||||
description: Terms related to InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/concepts/glossary/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 20
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
## data node
|
||||
|
||||
A node that runs the data service.
|
||||
|
||||
For high availability, installations must have at least two data nodes.
|
||||
The number of data nodes in your cluster must be the same as your highest
|
||||
replication factor.
|
||||
Any replication factor greater than two gives you additional fault tolerance and
|
||||
query capacity within the cluster.
|
||||
|
||||
Data node sizes will depend on your needs.
|
||||
The Amazon EC2 m4.large or m4.xlarge are good starting points.
|
||||
|
||||
Related entries: [data service](#data-service), [replication factor](#replication-factor)
|
||||
|
||||
## data service
|
||||
|
||||
Stores all time series data and handles all writes and queries.
|
||||
|
||||
Related entries: [data node](#data-node)
|
||||
|
||||
## meta node
|
||||
|
||||
A node that runs the meta service.
|
||||
|
||||
For high availability, installations must have three meta nodes.
|
||||
Meta nodes can be very modestly sized instances like an EC2 t2.micro or even a
|
||||
nano.
|
||||
For additional fault tolerance, installations may use five meta nodes; the
|
||||
number of meta nodes must be an odd number.
|
||||
|
||||
Related entries: [meta service](#meta-service)
|
||||
|
||||
## meta service
|
||||
|
||||
The consistent data store that keeps state about the cluster, including which
|
||||
servers, databases, users, continuous queries, retention policies, subscriptions,
|
||||
and blocks of time exist.
|
||||
|
||||
Related entries: [meta node](#meta-node)
|
||||
|
||||
## replication factor
|
||||
|
||||
The attribute of the retention policy that determines how many copies of the
|
||||
data are stored in the cluster.
|
||||
InfluxDB replicates data across `N` data nodes, where `N` is the replication
|
||||
factor.
|
||||
|
||||
To maintain data availability for queries, the replication factor should be less
|
||||
than or equal to the number of data nodes in the cluster:
|
||||
|
||||
* Data is fully available when the replication factor is greater than the
|
||||
number of unavailable data nodes.
|
||||
* Data may be unavailable when the replication factor is less than the number of
|
||||
unavailable data nodes.
|
||||
|
||||
Any replication factor greater than two gives you additional fault tolerance and
|
||||
query capacity within the cluster.
|
||||
|
||||
## web console
|
||||
|
||||
Legacy user interface for InfluxDB Enterprise.
|
||||
|
||||
The web console has been deprecated; use [Chronograf](/{{< latest "chronograf" >}}/introduction/) instead.
|
||||
|
||||
If you are transitioning from the Enterprise Web Console to Chronograf, see how to [transition from the InfluxDB Web Admin Interface](/chronograf/v1.7/guides/transition-web-admin-interface/).
|
||||
|
||||
<!-- --- -->
|
||||
|
||||
## aggregation
|
||||
|
||||
An InfluxQL function that returns an aggregated value across a set of points.
|
||||
For a complete list of the available and upcoming aggregations, see [InfluxQL functions](/enterprise_influxdb/v1.9/query_language/functions/#aggregations).
|
||||
|
||||
Related entries: [function](#function), [selector](#selector), [transformation](#transformation)
|
||||
|
||||
## batch
|
||||
|
||||
A collection of data points in InfluxDB line protocol format, separated by newlines (`0x0A`).
|
||||
A batch of points may be submitted to the database using a single HTTP request to the write endpoint.
|
||||
This makes writes using the InfluxDB API much more performant by drastically reducing the HTTP overhead.
|
||||
InfluxData recommends batch sizes of 5,000-10,000 points, although different use cases may be better served by significantly smaller or larger batches.
|
||||
|
||||
Related entries: [InfluxDB line protocol](#influxdb-line-protocol), [point](#point)
|
||||
|
||||
## bucket
|
||||
|
||||
A bucket is a named location where time series data is stored in **InfluxDB 2.0**. In InfluxDB 1.8+, each combination of a database and a retention policy (database/retention-policy) represents a bucket. Use the [InfluxDB 2.0 API compatibility endpoints](/enterprise_influxdb/v1.9/tools/api#influxdb-2-0-api-compatibility-endpoints) included with InfluxDB 1.8+ to interact with buckets.
|
||||
|
||||
## continuous query (CQ)
|
||||
|
||||
An InfluxQL query that runs automatically and periodically within a database.
|
||||
Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
|
||||
See [Continuous Queries](/enterprise_influxdb/v1.9/query_language/continuous_queries/).
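For example, a minimal sketch (database, measurement, and field names are hypothetical) that downsamples raw data into 30-minute means:

```sql
-- Illustrative only: downsample "butterflies" into 30-minute averages
CREATE CONTINUOUS QUERY "cq_30m" ON "my_database"
BEGIN
  SELECT MEAN("butterflies") INTO "average_butterflies" FROM "census" GROUP BY time(30m)
END
```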
|
||||
|
||||
|
||||
Related entries: [function](#function)
|
||||
|
||||
## database
|
||||
|
||||
A logical container for users, retention policies, continuous queries, and time series data.
|
||||
|
||||
Related entries: [continuous query](#continuous-query-cq), [retention policy](#retention-policy-rp), [user](#user)
|
||||
|
||||
## duration
|
||||
|
||||
The attribute of the retention policy that determines how long InfluxDB stores data.
|
||||
Data older than the duration are automatically dropped from the database.
|
||||
See [Database Management](/enterprise_influxdb/v1.9/query_language/manage-database/#create-retention-policies-with-create-retention-policy) for how to set duration.
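For example, a sketch (database and policy names are hypothetical) that creates a retention policy with a one-day duration:

```sql
-- Data older than one day becomes eligible to be dropped
CREATE RETENTION POLICY "one_day_only" ON "my_database" DURATION 1d REPLICATION 1
```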
|
||||
|
||||
Related entries: [retention policy](#retention-policy-rp)
|
||||
|
||||
## field
|
||||
|
||||
The key-value pair in an InfluxDB data structure that records metadata and the actual data value.
|
||||
Fields are required in InfluxDB data structures and they are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags.
|
||||
|
||||
*Query tip:* Compare fields to tags; tags are indexed.
|
||||
|
||||
Related entries: [field key](#field-key), [field set](#field-set), [field value](#field-value), [tag](#tag)
|
||||
|
||||
## field key
|
||||
|
||||
The key part of the key-value pair that makes up a field.
|
||||
Field keys are strings and they store metadata.
|
||||
|
||||
Related entries: [field](#field), [field set](#field-set), [field value](#field-value), [tag key](#tag-key)
|
||||
|
||||
## field set
|
||||
|
||||
The collection of field keys and field values on a point.
|
||||
|
||||
Related entries: [field](#field), [field key](#field-key), [field value](#field-value), [point](#point)
|
||||
|
||||
## field value
|
||||
|
||||
The value part of the key-value pair that makes up a field.
|
||||
Field values are the actual data; they can be strings, floats, integers, or booleans.
|
||||
A field value is always associated with a timestamp.
|
||||
|
||||
Field values are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant.
|
||||
|
||||
*Query tip:* Compare field values to tag values; tag values are indexed.
|
||||
|
||||
Related entries: [field](#field), [field key](#field-key), [field set](#field-set), [tag value](#tag-value), [timestamp](#timestamp)
|
||||
|
||||
## function
|
||||
|
||||
InfluxQL aggregations, selectors, and transformations.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.9/query_language/functions/) for a complete list of InfluxQL functions.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [selector](#selector), [transformation](#transformation)
|
||||
|
||||
## identifier
|
||||
|
||||
Tokens that refer to continuous query names, database names, field keys,
|
||||
measurement names, retention policy names, subscription names, tag keys, and
|
||||
user names.
|
||||
See [Query Language Specification](/enterprise_influxdb/v1.9/query_language/spec/#identifiers).
|
||||
|
||||
Related entries:
|
||||
[database](#database),
|
||||
[field key](#field-key),
|
||||
[measurement](#measurement),
|
||||
[retention policy](#retention-policy-rp),
|
||||
[tag key](#tag-key),
|
||||
[user](#user)
|
||||
|
||||
## InfluxDB line protocol
|
||||
|
||||
The text based format for writing points to InfluxDB. See [InfluxDB line protocol](/enterprise_influxdb/v1.9/write_protocols/).
|
||||
|
||||
## measurement
|
||||
|
||||
The part of the InfluxDB data structure that describes the data stored in the associated fields.
|
||||
Measurements are strings.
|
||||
|
||||
Related entries: [field](#field), [series](#series)
|
||||
|
||||
## metastore
|
||||
|
||||
Contains internal information about the status of the system.
|
||||
The metastore contains the user information, databases, retention policies, shard metadata, continuous queries, and subscriptions.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [user](#user)
|
||||
|
||||
## node
|
||||
|
||||
An independent `influxd` process.
|
||||
|
||||
Related entries: [server](#server)
|
||||
|
||||
## now()
|
||||
|
||||
The local server's nanosecond timestamp.
|
||||
|
||||
## point
|
||||
|
||||
In InfluxDB, a point represents a single data record, similar to a row in a SQL database table. Each point:
|
||||
|
||||
- has a measurement, a tag set, a field key, a field value, and a timestamp;
|
||||
- is uniquely identified by its series and timestamp.
|
||||
|
||||
You cannot store more than one point with the same timestamp in a series.
|
||||
If you write a point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, and any ties go to the new field set.
|
||||
For more information about duplicate points, see [How does InfluxDB handle duplicate points?](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)
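For example, assuming two hypothetical line protocol writes to the same series with the same timestamp:

```
census,location=1 butterflies=12 1439856000000000000
census,location=1 honeybees=23 1439856000000000000
```

The stored point then carries the merged field set `butterflies=12,honeybees=23` at that timestamp.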
|
||||
|
||||
Related entries: [field set](#field-set), [series](#series), [timestamp](#timestamp)
|
||||
|
||||
## points per second
|
||||
|
||||
A deprecated measurement of the rate at which data are persisted to InfluxDB.
|
||||
The schema allows and even encourages the recording of multiple metric values per point, rendering points per second ambiguous.
|
||||
|
||||
Write speeds are generally quoted in values per second, a more precise metric.
|
||||
|
||||
Related entries: [point](#point), [schema](#schema), [values per second](#values-per-second)
|
||||
|
||||
## query
|
||||
|
||||
An operation that retrieves data from InfluxDB.
|
||||
See [Data Exploration](/enterprise_influxdb/v1.9/query_language/explore-data/), [Schema Exploration](/enterprise_influxdb/v1.9/query_language/explore-schema/), [Database Management](/enterprise_influxdb/v1.9/query_language/manage-database/).
|
||||
|
||||
## replication factor
|
||||
|
||||
The attribute of the retention policy that determines how many copies of data to concurrently store (or retain) in the cluster. Replicating copies ensures that data is available when one or more data nodes are unavailable.
|
||||
|
||||
For three nodes or less, the default replication factor equals the number of data nodes.
|
||||
For more than three nodes, the default replication factor is 3. To change the default replication factor, specify the replication factor `n` in the retention policy.
|
||||
|
||||
Related entries: [duration](#duration), [node](#node),
|
||||
[retention policy](#retention-policy-rp)
|
||||
|
||||
## retention policy (RP)
|
||||
|
||||
Describes how long InfluxDB keeps data (duration), how many copies of the data to store in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.
|
||||
|
||||
When you create a database, InfluxDB creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
|
||||
For more information, see [Retention policy management](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management).
|
||||
|
||||
Related entries: [duration](#duration), [measurement](#measurement), [replication factor](#replication-factor), [series](#series), [shard duration](#shard-duration), [tag set](#tag-set)
|
||||
|
||||
## schema
|
||||
|
||||
How the data are organized in InfluxDB.
|
||||
The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys.
|
||||
See [Schema Design](/enterprise_influxdb/v1.9/concepts/schema_and_data_layout/) for more information.
|
||||
|
||||
Related entries: [database](#database), [field key](#field-key), [measurement](#measurement), [retention policy](#retention-policy-rp), [series](#series), [tag key](#tag-key), [tag value](#tag-value)
|
||||
|
||||
## selector
|
||||
|
||||
An InfluxQL function that returns a single point from the range of specified points.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.9/query_language/functions/#selectors) for a complete list of the available and upcoming selectors.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [function](#function), [transformation](#transformation)
|
||||
|
||||
## series
|
||||
|
||||
A logical grouping of data defined by shared measurement, tag set, and field key.
|
||||
|
||||
Related entries: [field set](#field-set), [measurement](#measurement), [tag set](#tag-set)
|
||||
|
||||
## series cardinality
|
||||
|
||||
The number of unique database, measurement, tag set, and field key combinations in an InfluxDB instance.
|
||||
|
||||
For example, assume that an InfluxDB instance has a single database and one measurement.
|
||||
The single measurement has two tag keys: `email` and `status`.
|
||||
If there are three different `email`s, and each email address is associated with two
|
||||
different `status`es, then the series cardinality for the measurement is 6
|
||||
(3 * 2 = 6):
|
||||
|
||||
| email | status |
|
||||
| :-------------------- | :----- |
|
||||
| lorr@influxdata.com | start |
|
||||
| lorr@influxdata.com | finish |
|
||||
| marv@influxdata.com | start |
|
||||
| marv@influxdata.com | finish |
|
||||
| cliff@influxdata.com | start |
|
||||
| cliff@influxdata.com | finish |
|
||||
|
||||
Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags.
|
||||
Dependent tags are tags that are scoped by another tag and do not increase series
|
||||
cardinality.
|
||||
If we add the tag `firstname` to the example above, the series cardinality
|
||||
would not be 18 (3 * 2 * 3 = 18).
|
||||
It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
|
||||
|
||||
| email | status | firstname |
|
||||
| :-------------------- | :----- | :-------- |
|
||||
| lorr@influxdata.com | start | lorraine |
|
||||
| lorr@influxdata.com | finish | lorraine |
|
||||
| marv@influxdata.com | start | marvin |
|
||||
| marv@influxdata.com | finish | marvin |
|
||||
| cliff@influxdata.com | start | clifford |
|
||||
| cliff@influxdata.com | finish | clifford |
|
||||
|
||||
See [SHOW CARDINALITY](/enterprise_influxdb/v1.9/query_language/spec/#show-cardinality) to learn about the InfluxQL commands for series cardinality.
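For example, a quick way to estimate series cardinality for the current database:

```sql
-- Returns an estimate of the number of series in the database
SHOW SERIES CARDINALITY
```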
|
||||
|
||||
Related entries: [field key](#field-key),[measurement](#measurement), [tag key](#tag-key), [tag set](#tag-set)
|
||||
|
||||
## series key
|
||||
|
||||
A series key identifies a particular series by measurement, tag set, and field key.
|
||||
|
||||
For example:
|
||||
|
||||
```
|
||||
# measurement, tag set, field key
|
||||
h2o_level, location=santa_monica, h2o_feet
|
||||
```
|
||||
|
||||
Related entries: [series](#series)
|
||||
|
||||
## server
|
||||
|
||||
A machine, virtual or physical, that is running InfluxDB.
|
||||
There should only be one InfluxDB process per server.
|
||||
|
||||
Related entries: [node](#node)
|
||||
|
||||
## shard
|
||||
|
||||
A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk.
|
||||
Every shard belongs to one and only one shard group.
|
||||
Multiple shards may exist in a single shard group.
|
||||
Each shard contains a specific set of series.
|
||||
All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.
|
||||
|
||||
Related entries: [series](#series), [shard duration](#shard-duration), [shard group](#shard-group), [tsm](#tsm-time-structured-merge-tree)
|
||||
|
||||
## shard duration
|
||||
|
||||
The shard duration determines how much time each shard group spans.
|
||||
The specific interval is determined by the `SHARD DURATION` of the retention policy.
|
||||
See [Retention Policy management](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management) for more information.
|
||||
|
||||
For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard group](#shard-group)
|
||||
|
||||
## shard group
|
||||
|
||||
Shard groups are logical containers for shards.
|
||||
Shard groups are organized by time and retention policy.
|
||||
Every retention policy that contains data has at least one associated shard group.
|
||||
A given shard group contains all shards with data for the interval covered by the shard group.
|
||||
The interval spanned by each shard group is the shard duration.
|
||||
|
||||
Related entries: [database](#database), [retention policy](#retention-policy-rp), [series](#series), [shard](#shard), [shard duration](#shard-duration)
|
||||
|
||||
## subscription
|
||||
|
||||
Subscriptions allow [Kapacitor](/{{< latest "kapacitor" >}}/) to receive data from InfluxDB in a push model rather than the pull model based on querying data.
|
||||
When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor.
|
||||
Subscriptions can use TCP or UDP for transmitting the writes.
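For illustration only (the subscription name, database, and host are hypothetical), a subscription that pushes writes over UDP might look like this:

```sql
CREATE SUBSCRIPTION "kapacitor_sub" ON "my_database"."autogen" DESTINATIONS ALL 'udp://kapacitor-host:9092'
```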
|
||||
|
||||
## tag
|
||||
|
||||
The key-value pair in the InfluxDB data structure that records metadata.
|
||||
Tags are an optional part of the data structure, but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant.
|
||||
*Query tip:* Compare tags to fields; fields are not indexed.
|
||||
|
||||
Related entries: [field](#field), [tag key](#tag-key), [tag set](#tag-set), [tag value](#tag-value)
|
||||
|
||||
## tag key
|
||||
|
||||
The key part of the key-value pair that makes up a tag.
|
||||
Tag keys are strings and they store metadata.
|
||||
Tag keys are indexed so queries on tag keys are performant.
|
||||
|
||||
*Query tip:* Compare tag keys to field keys; field keys are not indexed.
|
||||
|
||||
Related entries: [field key](#field-key), [tag](#tag), [tag set](#tag-set), [tag value](#tag-value)
|
||||
|
||||
## tag set
|
||||
|
||||
The collection of tag keys and tag values on a point.
|
||||
|
||||
Related entries: [point](#point), [series](#series), [tag](#tag), [tag key](#tag-key), [tag value](#tag-value)
|
||||
|
||||
## tag value
|
||||
|
||||
The value part of the key-value pair that makes up a tag.
|
||||
Tag values are strings and they store metadata.
|
||||
Tag values are indexed so queries on tag values are performant.
|
||||
|
||||
|
||||
Related entries: [tag](#tag), [tag key](#tag-key), [tag set](#tag-set)
|
||||
|
||||
## timestamp
|
||||
|
||||
The date and time associated with a point.
|
||||
All time in InfluxDB is UTC.
|
||||
|
||||
For how to specify time when writing data, see [Write Syntax](/enterprise_influxdb/v1.9/write_protocols/write_syntax/).
|
||||
For how to specify time when querying data, see [Data Exploration](/enterprise_influxdb/v1.9/query_language/explore-data/#time-syntax).
|
||||
|
||||
Related entries: [point](#point)
|
||||
|
||||
## transformation
|
||||
|
||||
An InfluxQL function that returns a value or a set of values calculated from specified points, but does not return an aggregated value across those points.
|
||||
See [InfluxQL Functions](/enterprise_influxdb/v1.9/query_language/functions/#transformations) for a complete list of the available and upcoming aggregations.
|
||||
|
||||
Related entries: [aggregation](#aggregation), [function](#function), [selector](#selector)
|
||||
|
||||
## TSM (Time Structured Merge tree)
|
||||
|
||||
The purpose-built data storage format for InfluxDB. TSM allows for greater compaction and higher write and read throughput than existing B+ or LSM tree implementations. See [Storage Engine](/enterprise_influxdb/v1.9/concepts/storage_engine/) for more.
|
||||
|
||||
## user
|
||||
|
||||
There are two kinds of users in InfluxDB:
|
||||
|
||||
* *Admin users* have `READ` and `WRITE` access to all databases and full access to administrative queries and user management commands.
|
||||
* *Non-admin users* have `READ`, `WRITE`, or `ALL` (both `READ` and `WRITE`) access per database.
|
||||
|
||||
When authentication is enabled, InfluxDB only executes HTTP requests that are sent with a valid username and password.
|
||||
See [Authentication and Authorization](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/).
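For example, a sketch (user and database names are hypothetical) granting a non-admin user read access to a single database:

```sql
GRANT READ ON "my_database" TO "jdoe"
```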
|
||||
|
||||
## values per second
|
||||
|
||||
The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second.
|
||||
|
||||
To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5000 points is written 10 times per second, then the values per second rate is `4 field values per point * 5000 points per batch * 10 batches per second = 200,000 values per second`.
|
||||
|
||||
Related entries: [batch](#batch), [field](#field), [point](#point), [points per second](#points-per-second)
|
||||
|
||||
## WAL (Write Ahead Log)
|
||||
|
||||
The temporary cache for recently written points. To reduce the frequency with which the permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into the TSM.
|
||||
|
||||
Points in the WAL can be queried, and they persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes.
|
||||
|
||||
Related entries: [tsm](#tsm-time-structured-merge-tree)
|
||||
|
||||
<!--
|
||||
|
||||
|
||||
|
||||
## shard
|
||||
|
||||
## shard group
|
||||
-->
|
|
@ -0,0 +1,85 @@
|
|||
---
|
||||
title: InfluxDB Enterprise startup process
|
||||
description: >
|
||||
On startup, InfluxDB Enterprise starts all subsystems and services in a deterministic order.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 10
|
||||
name: Startup process
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
On startup, InfluxDB Enterprise starts all subsystems and services in the following order:
|
||||
|
||||
1. [TSDBStore](#tsdbstore)
|
||||
2. [Monitor](#monitor)
|
||||
3. [Cluster](#cluster)
|
||||
4. [Precreator](#precreator)
|
||||
5. [Snapshotter](#snapshotter)
|
||||
6. [Continuous Query](#continuous-query)
|
||||
7. [Announcer](#announcer)
|
||||
8. [Retention](#retention)
|
||||
9. [Stats](#stats)
|
||||
10. [Anti-entropy](#anti-entropy)
|
||||
11. [HTTP API](#http-api)
|
||||
|
||||
A **subsystem** is a collection of related services managed together as part of a greater whole.
|
||||
A **service** is a process that provides specific functionality.
|
||||
|
||||
## Subsystems and services
|
||||
|
||||
### TSDBStore
|
||||
The TSDBStore subsystem starts and manages the TSM storage engine.
|
||||
This includes services such as the points writer (write), reads (query),
|
||||
and [hinted handoff (HH)](/enterprise_influxdb/v1.9/concepts/clustering/#hinted-handoff).
|
||||
TSDBStore first opens all the shards and loads write-ahead log (WAL) data into the in-memory write cache.
|
||||
If `influxd` was shut down cleanly previously, there will not be any WAL data.
|
||||
It then loads a portion of each shard's index.
|
||||
|
||||
{{% note %}}
|
||||
#### Index versions and startup times
|
||||
If using `inmem` indexing, InfluxDB loads all shard indexes into memory, which,
|
||||
depending on the number of series in the database, can take time.
|
||||
If using `tsi1` indexing, InfluxDB only loads hot shard indexes
|
||||
(the most recent shards or shards currently being written to) into memory and
|
||||
stores cold shard indexes on disk.
|
||||
Use `tsi1` indexing to see shorter startup times.
|
||||
{{% /note %}}
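As a minimal sketch, `tsi1` indexing is enabled through the `index-version` setting in the `[data]` section of the data node configuration file:

```
[data]
  index-version = "tsi1"
```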
|
||||
|
||||
### Monitor
|
||||
The Monitor service provides statistical and diagnostic information to InfluxDB about InfluxDB itself.
|
||||
This information helps with database troubleshooting and performance analysis.
|
||||
|
||||
### Cluster
|
||||
The Cluster service provides implementations of InfluxDB OSS v1.8 interfaces
|
||||
that operate on an InfluxDB Enterprise v1.8 cluster.
|
||||
|
||||
### Precreator
|
||||
The Precreator service creates shards before they are needed.
|
||||
This ensures that necessary shards exist before new time series data arrives and that
|
||||
write throughput is not affected by the creation of a new shard.
|
||||
|
||||
### Snapshotter
|
||||
The Snapshotter service routinely creates snapshots of InfluxDB Enterprise metadata.
|
||||
|
||||
### Continuous Query
|
||||
The Continuous Query (CQ) subsystem manages all InfluxDB CQs.
|
||||
|
||||
### Announcer
|
||||
The Announcer service announces a data node's status to meta nodes.
|
||||
|
||||
### Retention
|
||||
The Retention service enforces [retention policies](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp)
|
||||
and drops data as it expires.
|
||||
|
||||
### Stats
|
||||
The Stats service monitors cluster-level statistics.
|
||||
|
||||
### Anti-entropy
|
||||
The Anti-entropy (AE) subsystem is responsible for reconciling differences between shards.
|
||||
For more information, see [Use anti-entropy](/enterprise_influxdb/v1.9/administration/anti-entropy/).
|
||||
|
||||
### HTTP API
|
||||
The InfluxDB HTTP API service provides a public-facing interface to interact with
|
||||
InfluxDB Enterprise and internal interfaces used within the InfluxDB Enterprise cluster.
|
||||
|
|
@ -0,0 +1,60 @@
|
|||
---
|
||||
title: InfluxDB design insights and tradeoffs
|
||||
description: >
|
||||
Optimizing for time series use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: InfluxDB design insights and tradeoffs
|
||||
weight: 40
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/key-concepts/design-principles/
|
||||
---
|
||||
|
||||
InfluxDB is a time series database.
|
||||
Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
|
||||
Below is a list of some of those design insights that lead to tradeoffs:
|
||||
|
||||
1. For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
|
||||
|
||||
_**Pro:**_ Simplified [conflict resolution](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) increases write performance.
|
||||
_**Con:**_ Cannot store duplicate data; may overwrite data in rare circumstances.
|
||||
|
||||
2. Deletes are a rare occurrence.
|
||||
When they do occur it is almost always against large ranges of old data that are cold for writes.
|
||||
|
||||
_**Pro:**_ Restricting access to deletes allows for increased query and write performance.
|
||||
_**Con:**_ Delete functionality is significantly restricted.
|
||||
|
||||
3. Updates to existing data are a rare occurrence and contentious updates never happen.
|
||||
Time series data is predominantly new data that is never updated.
|
||||
|
||||
_**Pro:**_ Restricting access to updates allows for increased query and write performance.
|
||||
_**Con:**_ Update functionality is significantly restricted.
|
||||
|
||||
4. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order.
|
||||
|
||||
_**Pro:**_ Adding data in time ascending order is significantly more performant.
|
||||
_**Con:**_ Writing points with random times or with time not in ascending order is significantly less performant.
|
||||
|
||||
5. Scale is critical.
|
||||
The database must be able to handle a *high* volume of reads and writes.
|
||||
|
||||
_**Pro:**_ The database can handle a *high* volume of reads and writes.
|
||||
_**Con:**_ The InfluxDB development team was forced to make tradeoffs to increase performance.
|
||||
|
||||
6. Being able to write and query the data is more important than having a strongly consistent view.
|
||||
|
||||
_**Pro:**_ Writing and querying the database can be done by multiple clients and at high loads.
|
||||
_**Con:**_ Query returns may not include the most recent points if database is under heavy load.
|
||||
|
||||
7. Many time [series](/enterprise_influxdb/v1.9/concepts/glossary/#series) are ephemeral.
|
||||
There are often time series that appear only for a few hours and then go away, e.g.
|
||||
a new host that gets started and reports for a while and then gets shut down.
|
||||
|
||||
_**Pro:**_ InfluxDB is good at managing discontinuous data.
|
||||
_**Con:**_ Schema-less design means that some database functions are not supported, e.g., there are no cross-table joins.
|
||||
|
||||
8. No one point is too important.
|
||||
|
||||
_**Pro:**_ InfluxDB has very powerful tools to deal with aggregate data and large data sets.
|
||||
_**Con:**_ Points don't have IDs in the traditional sense; they are differentiated by timestamp and series.
|
|
@ -0,0 +1,202 @@
|
|||
---
|
||||
title: InfluxDB key concepts
|
||||
description: Covers key concepts to learn about InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Key concepts
|
||||
weight: 10
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/key-concepts/
|
||||
---
|
||||
|
||||
Before diving into InfluxDB, it's good to get acquainted with some key concepts of the database. This document introduces key InfluxDB concepts and elements by covering how the following elements work together in InfluxDB:
|
||||
|
||||
- [database](/enterprise_influxdb/v1.9/concepts/glossary/#database)
|
||||
- [field key](/enterprise_influxdb/v1.9/concepts/glossary/#field-key)
|
||||
- [field set](/enterprise_influxdb/v1.9/concepts/glossary/#field-set)
|
||||
- [field value](/enterprise_influxdb/v1.9/concepts/glossary/#field-value)
|
||||
- [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement)
|
||||
- [point](/enterprise_influxdb/v1.9/concepts/glossary/#point)
|
||||
- [retention policy](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp)
|
||||
- [series](/enterprise_influxdb/v1.9/concepts/glossary/#series)
|
||||
- [tag key](/enterprise_influxdb/v1.9/concepts/glossary/#tag-key)
|
||||
- [tag set](/enterprise_influxdb/v1.9/concepts/glossary/#tag-set)
|
||||
- [tag value](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value)
|
||||
- [timestamp](/enterprise_influxdb/v1.9/concepts/glossary/#timestamp)
|
||||
|
||||
|
||||
### Sample data
|
||||
|
||||
The next section references the data printed out below.
|
||||
The data are fictional, but represent a believable setup in InfluxDB.
|
||||
They show the number of butterflies and honeybees counted by two scientists (`langstroth` and `perpetua`) in two locations (location `1` and location `2`) over the time period from August 18, 2015 at midnight through August 18, 2015 at 6:12 AM.
|
||||
Assume that the data live in a database called `my_database` and are subject to the `autogen` retention policy (more on databases and retention policies to come).
|
||||
|
||||
*Hint:* Hover over the links for tooltips to get acquainted with InfluxDB terminology and the layout.
|
||||
|
||||
**name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
|
||||
|
||||
| time | <span class ="tooltip" data-tooltip-text ="Field key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Field key">honeybees</span> | <span class ="tooltip" data-tooltip-text ="Tag key">location</span> | <span class ="tooltip" data-tooltip-text ="Tag key">scientist</span> |
|
||||
| ---- | ------------------------------------------------------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------------------- | -------------------------------------------------------------------- |
|
||||
| 2015-08-18T00:00:00Z | 12 | 23 | 1 | langstroth |
|
||||
| 2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
|
||||
| 2015-08-18T00:06:00Z | 11 | 28 | 1 | langstroth |
|
||||
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">3</span> | <span class ="tooltip" data-tooltip-text ="Field value">28</span> | <span class ="tooltip" data-tooltip-text ="Tag value">1</span> | <span class ="tooltip" data-tooltip-text ="Tag value">perpetua</span> |
|
||||
| 2015-08-18T05:54:00Z | 2 | 11 | 2 | langstroth |
|
||||
| 2015-08-18T06:00:00Z | 1 | 10 | 2 | langstroth |
|
||||
| 2015-08-18T06:06:00Z | 8 | 23 | 2 | perpetua |
|
||||
| 2015-08-18T06:12:00Z | 7 | 22 | 2 | perpetua |
|
||||
|
||||
### Discussion
|
||||
|
||||
Now that you've seen some sample data in InfluxDB, this section covers what it all means.
|
||||
|
||||
InfluxDB is a time series database so it makes sense to start with what is at the root of everything we do: time.
|
||||
In the data above there's a column called `time` - all data in InfluxDB have that column.
|
||||
`time` stores timestamps, and the <a name="timestamp"></a>_**timestamp**_ shows the date and time, in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) UTC, associated with particular data.
|
||||
|
||||
The next two columns, called `butterflies` and `honeybees`, are fields.
|
||||
Fields are made up of field keys and field values.
|
||||
<a name="field-key"></a>_**Field keys**_ (`butterflies` and `honeybees`) are strings; the field key `butterflies` tells us that the field values `12`-`7` refer to butterflies and the field key `honeybees` tells us that the field values `23`-`22` refer to, well, honeybees.
|
||||
|
||||
<a name="field-value"></a>_**Field values**_ are your data; they can be strings, floats, integers, or Booleans, and, because InfluxDB is a time series database, a field value is always associated with a timestamp.
|
||||
The field values in the sample data are:
|
||||
|
||||
```
|
||||
12 23
|
||||
1 30
|
||||
11 28
|
||||
3 28
|
||||
2 11
|
||||
1 10
|
||||
8 23
|
||||
7 22
|
||||
```
|
||||
|
||||
In the data above, the collection of field-key and field-value pairs make up a <a name="field-set"></a>_**field set**_.
|
||||
Here are all eight field sets in the sample data:
|
||||
|
||||
* `butterflies = 12 honeybees = 23`
|
||||
* `butterflies = 1 honeybees = 30`
|
||||
* `butterflies = 11 honeybees = 28`
|
||||
* `butterflies = 3 honeybees = 28`
|
||||
* `butterflies = 2 honeybees = 11`
|
||||
* `butterflies = 1 honeybees = 10`
|
||||
* `butterflies = 8 honeybees = 23`
|
||||
* `butterflies = 7 honeybees = 22`
|
||||
|
||||
Fields are a required piece of the InfluxDB data structure - you cannot have data in InfluxDB without fields.
|
||||
It's also important to note that fields are not indexed.
|
||||
[Queries](/enterprise_influxdb/v1.9/concepts/glossary/#query) that use field values as filters must scan all values that match the other conditions in the query.
|
||||
As a result, those queries are not performant relative to queries on tags (more on tags below).
|
||||
In general, fields should not contain commonly-queried metadata.
|
||||
|
||||
The last two columns in the sample data, called `location` and `scientist`, are tags.
|
||||
Tags are made up of tag keys and tag values.
|
||||
Both <a name="tag-key"></a>_**tag keys**_ and <a name="tag-value"></a>_**tag values**_ are stored as strings and record metadata.
|
||||
The tag keys in the sample data are `location` and `scientist`.
|
||||
The tag key `location` has two tag values: `1` and `2`.
|
||||
The tag key `scientist` also has two tag values: `langstroth` and `perpetua`.
|
||||
|
||||
In the data above, the <a name="tag-set"></a>_**tag set**_ is the different combinations of all the tag key-value pairs.
|
||||
The four tag sets in the sample data are:
|
||||
|
||||
* `location = 1`, `scientist = langstroth`
|
||||
* `location = 2`, `scientist = langstroth`
|
||||
* `location = 1`, `scientist = perpetua`
|
||||
* `location = 2`, `scientist = perpetua`
|
||||
|
||||
Tags are optional.
|
||||
You don't need to have tags in your data structure, but it's generally a good idea to make use of them because, unlike fields, tags are indexed.
|
||||
This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata.
|
||||
|
||||
Avoid using the following reserved keys:
|
||||
|
||||
* `_field`
|
||||
* `_measurement`
|
||||
* `time`
|
||||
|
||||
If reserved keys are included as a tag or field key, the associated point is discarded.
|
||||
|
||||
> **Why indexing matters: The schema case study**
|
||||
|
||||
> Say you notice that most of your queries focus on the values of the field keys `honeybees` and `butterflies`:
|
||||
|
||||
> `SELECT * FROM "census" WHERE "butterflies" = 1`
|
||||
> `SELECT * FROM "census" WHERE "honeybees" = 23`
|
||||
|
||||
> Because fields aren't indexed, InfluxDB scans every value of `butterflies` in the first query and every value of `honeybees` in the second query before it provides a response.
|
||||
That behavior can hurt query response times - especially on a much larger scale.
|
||||
To optimize your queries, it may be beneficial to rearrange your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) such that the fields (`butterflies` and `honeybees`) become the tags and the tags (`location` and `scientist`) become the fields:
|
||||
|
||||
> **name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
|
||||
>
|
||||
| time | <span class ="tooltip" data-tooltip-text ="Field key">location</span> | <span class ="tooltip" data-tooltip-text ="Field key">scientist</span> | <span class ="tooltip" data-tooltip-text ="Tag key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Tag key">honeybees</span> |
|
||||
| ---- | --------------------------------------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------- |
|
||||
| 2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
|
||||
| 2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
|
||||
| 2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
|
||||
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">1</span> | <span class ="tooltip" data-tooltip-text ="Field value">perpetua</span> | <span class ="tooltip" data-tooltip-text ="Tag value">3</span> | <span class ="tooltip" data-tooltip-text ="Tag value">28</span> |
|
||||
| 2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
|
||||
| 2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
|
||||
| 2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
|
||||
| 2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
|
||||
|
||||
> Now that `butterflies` and `honeybees` are tags, InfluxDB won't have to scan every one of their values when it performs the queries above - this means that your queries are even faster.
|
||||
|
||||
The <a name=measurement></a>_**measurement**_ acts as a container for tags, fields, and the `time` column, and the measurement name is the description of the data that are stored in the associated fields.
|
||||
Measurement names are strings, and, for any SQL users out there, a measurement is conceptually similar to a table.
|
||||
The only measurement in the sample data is `census`.
|
||||
The name `census` tells us that the field values record the number of `butterflies` and `honeybees` - not their size, direction, or some sort of happiness index.
|
||||
|
||||
A single measurement can belong to different retention policies.
|
||||
A <a name="retention-policy"></a>_**retention policy**_ describes how long InfluxDB keeps data (`DURATION`) and how many copies of the data are stored in the cluster (`REPLICATION`).
|
||||
If you're interested in reading more about retention policies, check out [Database Management](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management).
|
||||
|
||||
{{% warn %}} Replication factors do not serve a purpose with single node instances.
|
||||
{{% /warn %}}
|
||||
|
||||
In the sample data, everything in the `census` measurement belongs to the `autogen` retention policy.
|
||||
InfluxDB automatically creates that retention policy; it has an infinite duration and a replication factor set to one.
|
||||
|
||||
Now that you're familiar with measurements, tag sets, and retention policies, let's discuss series.
|
||||
In InfluxDB, a <a name=series></a>_**series**_ is a collection of points that share a measurement, tag set, and field key.
|
||||
The data above consist of eight series:
|
||||
|
||||
| Series number | Measurement | Tag set | Field key |
|
||||
|:------------------------ | ----------- | ------- | --------- |
|
||||
| series 1 | `census` | `location = 1`,`scientist = langstroth` | `butterflies` |
|
||||
| series 2 | `census` | `location = 2`,`scientist = langstroth` | `butterflies` |
|
||||
| series 3 | `census` | `location = 1`,`scientist = perpetua` | `butterflies` |
|
||||
| series 4 | `census` | `location = 2`,`scientist = perpetua` | `butterflies` |
|
||||
| series 5 | `census` | `location = 1`,`scientist = langstroth` | `honeybees` |
|
||||
| series 6 | `census` | `location = 2`,`scientist = langstroth` | `honeybees` |
|
||||
| series 7 | `census` | `location = 1`,`scientist = perpetua` | `honeybees` |
|
||||
| series 8 | `census` | `location = 2`,`scientist = perpetua` | `honeybees` |
|
||||
|
||||
Understanding the concept of a series is essential when designing your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) and when working with your data in InfluxDB.
|
||||
|
||||
A <a name="point"></a>_**point**_ represents a single data record that has four components: a measurement, tag set, field set, and a timestamp. A point is uniquely identified by its series and timestamp.
|
||||
|
||||
For example, here's a single point:
|
||||
```
|
||||
name: census
|
||||
-----------------
|
||||
time butterflies honeybees location scientist
|
||||
2015-08-18T00:00:00Z 1 30 1 perpetua
|
||||
```
|
||||
|
||||
The point in this example is part of series 3 and 7 and defined by the measurement (`census`), the tag set (`location = 1`, `scientist = perpetua`), the field set (`butterflies = 1`, `honeybees = 30`), and the timestamp `2015-08-18T00:00:00Z`.
|
||||
|
||||
All of the stuff we've just covered is stored in a database - the sample data are in the database `my_database`.
|
||||
An InfluxDB <a name=database></a>_**database**_ is similar to traditional relational databases and serves as a logical container for users, retention policies, continuous queries, and, of course, your time series data.
|
||||
See [Authentication and Authorization](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/) and [Continuous Queries](/enterprise_influxdb/v1.9/query_language/continuous_queries/) for more on those topics.
|
||||
|
||||
Databases can have several users, continuous queries, retention policies, and measurements.
|
||||
InfluxDB is a schemaless database, which means it's easy to add new measurements, tags, and fields at any time.
|
||||
It's designed to make working with time series data awesome.
|
||||
|
||||
You made it!
|
||||
You've covered the fundamental concepts and terminology in InfluxDB.
|
||||
If you're just starting out, we recommend taking a look at [Getting Started](/enterprise_influxdb/v1.9/introduction/getting_started/) and the [Writing Data](/enterprise_influxdb/v1.9/guides/writing_data/) and [Querying Data](/enterprise_influxdb/v1.9/guides/querying_data/) guides.
|
||||
May our time series database serve you well 🕔.
|
|
@ -0,0 +1,251 @@
|
|||
---
|
||||
title: InfluxDB schema design and data layout
|
||||
description: >
|
||||
General guidelines for InfluxDB schema design and data layout.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Schema design and data layout
|
||||
weight: 50
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
Every InfluxDB use case is special and your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) will reflect that uniqueness.
|
||||
There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
|
||||
|
||||
<table style="width:100%">
|
||||
<tr>
|
||||
<td><a href="#general-recommendations">General Recommendations</a></td>
|
||||
<td><a href="#encouraged-schema-design">Encouraged Schema Design</a></td>
|
||||
<td><a href="#discouraged-schema-design">Discouraged Schema Design</a></td>
|
||||
<td><a href="#shard-group-duration-management">Shard Group Duration Management</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## General recommendations
|
||||
|
||||
### Encouraged schema design
|
||||
|
||||
We recommend that you:
|
||||
|
||||
- [Encode meta data in tags](#encode-meta-data-in-tags)
|
||||
- [Avoid using keywords as tag or field names](#avoid-using-keywords-as-tag-or-field-names)
|
||||
|
||||
#### Encode meta data in tags
|
||||
|
||||
[Tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) are indexed and [fields](/enterprise_influxdb/v1.9/concepts/glossary/#field) are not indexed.
|
||||
This means that queries on tags are more performant than those on fields.
|
||||
|
||||
In general, your queries should guide what gets stored as a tag and what gets stored as a field:
|
||||
|
||||
- Store commonly-queried meta data in tags
|
||||
- Store data in tags if you plan to use them with the InfluxQL `GROUP BY` clause
|
||||
- Store data in fields if you plan to use them with an [InfluxQL](/enterprise_influxdb/v1.9/query_language/functions/) function
|
||||
- Store numeric values as fields ([tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) only support string values)
|
||||
|
||||
#### Avoid using keywords as tag or field names
|
||||
|
||||
This isn't required, but it simplifies writing queries because you won't have to wrap tag or field names in double quotes.
|
||||
See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux](https://github.com/influxdata/flux/blob/master/docs/SPEC.md#keywords) keywords to avoid.
|
||||
|
||||
Also, if a tag or field name contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/#records) in Flux.
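For example, if a field were named `limit` (an InfluxQL keyword), you would have to double quote it; the measurement name here is hypothetical:

```sql
SELECT "limit" FROM "rate_limits"
```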
|
||||
|
||||
### Discouraged schema design
|
||||
|
||||
We recommend that you:
|
||||
|
||||
- [Avoid too many series](#avoid-too-many-series)
|
||||
- [Avoid the same name for a tag and a field](#avoid-the-same-name-for-a-tag-and-a-field)
|
||||
- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
|
||||
- [Avoid putting more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
|
||||
|
||||
#### Avoid too many series
|
||||
|
||||
[Tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) containing highly variable information like UUIDs, hashes, and random strings lead to a large number of [series](/enterprise_influxdb/v1.9/concepts/glossary/#series) in the database, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
|
||||
|
||||
See [Hardware sizing guidelines](/enterprise_influxdb/v1.9/reference/hardware_sizing/) for [series cardinality](/enterprise_influxdb/v1.9/concepts/glossary/#series-cardinality) recommendations based on your hardware. If the system has memory constraints, consider storing high-cardinality data as a field rather than a tag.
|
||||
|
||||
#### Avoid the same name for a tag and a field
|
||||
|
||||
Avoid using the same name for a tag and field key.
|
||||
This often results in unexpected behavior when querying data.
|
||||
|
||||
If you inadvertently add the same name for a tag and field key, see
|
||||
[Frequently asked questions](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
|
||||
for information about how to query the data predictably and how to fix the issue.
|
||||
|
||||
#### Avoid encoding data in measurement names
|
||||
|
||||
InfluxDB queries merge data that falls within the same [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement); it's better to differentiate data with [tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated or impossible.
|
||||
|
||||
_Example:_
|
||||
|
||||
Consider the following schema represented by line protocol.
|
||||
|
||||
```
|
||||
Schema 1 - Data encoded in the measurement name
|
||||
-------------
|
||||
blueberries.plot-1.north temp=50.1 1472515200000000000
|
||||
blueberries.plot-2.midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
|
||||
Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
|
||||
|
||||
For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
|
||||
Compare this to schema 2:
|
||||
|
||||
```
|
||||
Schema 2 - Data encoded in tags
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region:
|
||||
|
||||
##### Flux
|
||||
|
||||
```js
|
||||
// Schema 1 - Query for data encoded in the measurement name
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
|
||||
|> mean()
|
||||
|
||||
// Schema 2 - Query for data encoded in tags
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|
||||
|> mean()
|
||||
```
|
||||
|
||||
##### InfluxQL
|
||||
|
||||
```
|
||||
# Schema 1 - Query for data encoded in the measurement name
|
||||
> SELECT mean("temp") FROM /\.north$/
|
||||
|
||||
# Schema 2 - Query for data encoded in tags
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
|
||||
```
|
||||
|
||||
#### Avoid putting more than one piece of information in one tag
|
||||
|
||||
Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
|
||||
|
||||
Consider the following schema represented by line protocol.
|
||||
|
||||
```
|
||||
Schema 1 - Multiple data encoded in a single tag
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
The Schema 1 data encodes multiple separate parameters, the `plot` and `region`, into a single long tag value (`plot-1.north`).
|
||||
Compare this to the following schema represented in line protocol.
|
||||
|
||||
```
|
||||
Schema 2 - Data encoded in multiple tags
|
||||
-------------
|
||||
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
|
||||
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
|
||||
```
|
||||
|
||||
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
|
||||
Schema 2 is preferable because, with multiple tags, you don't need a regular expression.
|
||||
|
||||
##### Flux
|
||||
|
||||
```js
|
||||
// Schema 1 - Query for multiple data encoded in a single tag
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
|
||||
|> mean()
|
||||
|
||||
// Schema 2 - Query for data encoded in multiple tags
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start:2016-08-30T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|
||||
|> mean()
|
||||
```
|
||||
|
||||
##### InfluxQL
|
||||
|
||||
```
|
||||
# Schema 1 - Query for multiple data encoded in a single tag
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE location =~ /\.north$/
|
||||
|
||||
# Schema 2 - Query for data encoded in multiple tags
|
||||
> SELECT mean("temp") FROM "weather_sensor" WHERE region = 'north'
|
||||
```
|
||||
|
||||
## Shard group duration management
|
||||
|
||||
### Shard group duration overview
|
||||
|
||||
InfluxDB stores data in shard groups.
|
||||
Shard groups are organized by [retention policy](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp) (RP) and store data with timestamps that fall within a specific time interval called the [shard duration](/enterprise_influxdb/v1.9/concepts/glossary/#shard-duration).
|
||||
|
||||
If no shard group duration is provided, the shard group duration is determined by the RP [duration](/enterprise_influxdb/v1.9/concepts/glossary/#duration) at the time the RP is created. The default values are:
|
||||
|
||||
| RP Duration | Shard Group Duration |
|
||||
|---|---|
|
||||
| < 2 days | 1 hour |
|
||||
| >= 2 days and <= 6 months | 1 day |
|
||||
| > 6 months | 7 days |
|
||||
|
||||
The shard group duration is also configurable per RP.
|
||||
To configure the shard group duration, see [Retention Policy Management](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management).
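
For example, the shard group duration can be set with InfluxQL when a retention policy is created or altered. A sketch, using a hypothetical database `mydb` and policy `one_year`:

```
> CREATE RETENTION POLICY "one_year" ON "mydb" DURATION 52w REPLICATION 2 SHARD DURATION 1d
> ALTER RETENTION POLICY "one_year" ON "mydb" SHARD DURATION 7d
```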
|
||||
|
||||
### Shard group duration tradeoffs
|
||||
|
||||
Determining the optimal shard group duration requires finding the balance between:
|
||||
|
||||
- Better overall performance with longer shards
|
||||
- Flexibility provided by shorter shards
|
||||
|
||||
#### Long shard group duration
|
||||
|
||||
Longer shard group durations let InfluxDB store more data in the same logical location.
|
||||
This reduces data duplication, improves compression efficiency, and improves query speed in some cases.
|
||||
|
||||
#### Short shard group duration
|
||||
|
||||
Shorter shard group durations allow the system to more efficiently drop data and record incremental backups.
|
||||
When InfluxDB enforces an RP it drops entire shard groups, not individual data points, even if the points are older than the RP duration.
|
||||
A shard group is removed only once the *end time* of the shard group's duration is older than the RP duration.
|
||||
|
||||
For example, if your RP has a duration of one day, InfluxDB drops an hour's worth of data every hour and always has 25 shard groups: one for each hour in the day, plus an extra shard group that is partially expiring but isn't removed until the whole shard group is older than 24 hours.
|
||||
|
||||
>**Note:** A special use case to consider: filtering queries on schema data (such as tags, series, measurements) by time. For example, if you want to filter schema data within a one hour interval, you must set the shard group duration to 1h. For more information, see [filter schema data by time](/enterprise_influxdb/v1.9/query_language/explore-schema/#filter-meta-queries-by-time).
|
||||
|
||||
### Shard group duration recommendations
|
||||
|
||||
The default shard group durations work well for most cases. However, high-throughput or long-running instances will benefit from using longer shard group durations.
|
||||
Here are some recommendations for longer shard group durations:
|
||||
|
||||
| RP Duration | Shard Group Duration |
|
||||
|---|---|
|
||||
| <= 1 day | 6 hours |
|
||||
| > 1 day and <= 7 days | 1 day |
|
||||
| > 7 days and <= 3 months | 7 days |
|
||||
| > 3 months | 30 days |
|
||||
| infinite | 52 weeks or longer |
|
||||
|
||||
> **Note:** `INF` (infinite) is not a [valid shard group duration](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management).
|
||||
In extreme cases where data covers decades and will never be deleted, a long shard group duration like `1040w` (20 years) is perfectly valid.
|
||||
|
||||
Other factors to consider before setting shard group duration:
|
||||
|
||||
* Shard groups should be twice as long as the longest time range of the most frequent queries
|
||||
* Each shard group should contain more than 100,000 [points](/enterprise_influxdb/v1.9/concepts/glossary/#point)
|
||||
* Shard groups should each contain more than 1,000 points per [series](/enterprise_influxdb/v1.9/concepts/glossary/#series)
|
||||
|
||||
#### Shard group duration for backfilling
|
||||
|
||||
Bulk insertion of historical data covering a large time range in the past will trigger the creation of a large number of shards at once.
|
||||
The concurrent access and overhead of writing to hundreds or thousands of shards can quickly lead to slow performance and memory exhaustion.
|
||||
|
||||
When writing historical data, we highly recommend temporarily setting a longer shard group duration so fewer shards are created. Typically, a shard group duration of 52 weeks works well for backfilling.
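
For example, assuming a database named `mydb` whose default `autogen` retention policy normally uses a 1-week shard group duration, a sketch of the workflow looks like this (altering the shard group duration only affects shard groups created afterward):

```
# Before the backfill, lengthen the shard group duration
> ALTER RETENTION POLICY "autogen" ON "mydb" SHARD DURATION 52w

# ...write the historical data...

# After the backfill, restore the original shard group duration
> ALTER RETENTION POLICY "autogen" ON "mydb" SHARD DURATION 1w
```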
|
|
|
|||
---
|
||||
title: In-memory indexing and the Time-Structured Merge Tree (TSM)
|
||||
description: >
|
||||
InfluxDB storage engine, in-memory indexing, and the Time-Structured Merge Tree (TSM) in InfluxDB OSS.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: In-memory indexing with TSM
|
||||
weight: 60
|
||||
parent: Concepts
|
||||
v2: /influxdb/v2.0/reference/internals/storage-engine/
|
||||
---
|
||||
|
||||
## The InfluxDB storage engine and the Time-Structured Merge Tree (TSM)
|
||||
|
||||
The InfluxDB storage engine looks very similar to an LSM Tree.
|
||||
It has a write ahead log and a collection of read-only data files which are similar in concept to SSTables in an LSM Tree.
|
||||
TSM files contain sorted, compressed series data.
|
||||
|
||||
InfluxDB will create a [shard](/enterprise_influxdb/v1.9/concepts/glossary/#shard) for each block of time.
|
||||
For example, if you have a [retention policy](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp) with an unlimited duration, shards will be created for each 7-day block of time.
|
||||
Each of these shards maps to an underlying storage engine database.
|
||||
Each of these databases has its own [WAL](/enterprise_influxdb/v1.9/concepts/glossary/#wal-write-ahead-log) and TSM files.
|
||||
|
||||
We'll dig into each of these parts of the storage engine.
|
||||
|
||||
## Storage engine
|
||||
|
||||
The storage engine ties a number of components together and provides the external interface for storing and querying series data. It is composed of a number of components that each serve a particular role:
|
||||
|
||||
* In-Memory Index - The in-memory index is a shared index across shards that provides quick access to [measurements](/enterprise_influxdb/v1.9/concepts/glossary/#measurement), [tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag), and [series](/enterprise_influxdb/v1.9/concepts/glossary/#series). The index is used by the engine, but is not specific to the storage engine itself.
|
||||
* WAL - The WAL is a write-optimized storage format that allows for writes to be durable, but not easily queryable. Writes to the WAL are appended to segments of a fixed size.
|
||||
* Cache - The Cache is an in-memory representation of the data stored in the WAL. It is queried at runtime and merged with the data stored in TSM files.
|
||||
* TSM Files - TSM files store compressed series data in a columnar format.
|
||||
* FileStore - The FileStore mediates access to all TSM files on disk. It ensures that TSM files are installed atomically when existing ones are replaced as well as removing TSM files that are no longer used.
|
||||
* Compactor - The Compactor is responsible for converting less optimized Cache and TSM data into more read-optimized formats. It does this by compressing series, removing deleted data, optimizing indices and combining smaller files into larger ones.
|
||||
* Compaction Planner - The Compaction Planner determines which TSM files are ready for a compaction and ensures that multiple concurrent compactions do not interfere with each other.
|
||||
* Compression - Compression is handled by various Encoders and Decoders for specific data types. Some encoders are fairly static and always encode the same type the same way; others switch their compression strategy based on the shape of the data.
|
||||
* Writers/Readers - Each file type (WAL segment, TSM files, tombstones, etc..) has Writers and Readers for working with the formats.
|
||||
|
||||
### Write Ahead Log (WAL)
|
||||
|
||||
The WAL is organized as a bunch of files that look like `_000001.wal`.
|
||||
The file numbers are monotonically increasing and referred to as WAL segments.
|
||||
When a segment reaches 10MB in size, it is closed and a new one is opened. Each WAL segment stores multiple compressed blocks of writes and deletes.
|
||||
|
||||
When a write comes in, the new points are serialized, compressed using Snappy, and written to a WAL file.
|
||||
The file is `fsync`'d and the data is added to an in-memory index before a success is returned.
|
||||
This means that batching points together is required to achieve high throughput performance.
|
||||
(Optimal batch size seems to be 5,000-10,000 points per batch for many use cases.)
|
||||
|
||||
Each entry in the WAL follows a [TLV standard](https://en.wikipedia.org/wiki/Type-length-value) with a single byte representing the type of entry (write or delete), a 4 byte `uint32` for the length of the compressed block, and then the compressed block.
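
As an illustration of that framing (not the engine's actual code), a minimal Go sketch follows; the entry-type constant, the big-endian byte order, and the `github.com/golang/snappy` package are assumptions made for the example:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"

	"github.com/golang/snappy" // assumed compression package for this sketch
)

// Hypothetical type marker for a write entry; the real constants may differ.
const writeEntry byte = 0x01

// frameWALEntry builds one TLV-framed WAL entry: a 1-byte type, a 4-byte
// length of the compressed payload, then the Snappy-compressed payload.
// The byte order is an assumption made for this sketch.
func frameWALEntry(entryType byte, payload []byte) []byte {
	compressed := snappy.Encode(nil, payload)

	var buf bytes.Buffer
	buf.WriteByte(entryType)

	var length [4]byte
	binary.BigEndian.PutUint32(length[:], uint32(len(compressed)))
	buf.Write(length[:])
	buf.Write(compressed)
	return buf.Bytes()
}

func main() {
	entry := frameWALEntry(writeEntry, []byte(`weather_sensor,crop=blueberries temp=50.1 1472515200000000000`))
	fmt.Printf("framed entry is %d bytes\n", len(entry))
}
```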
|
||||
|
||||
### Cache
|
||||
|
||||
The Cache is an in-memory copy of all data points currently stored in the WAL.
|
||||
The points are organized by the key, which is the measurement, [tag set](/enterprise_influxdb/v1.9/concepts/glossary/#tag-set), and unique [field](/enterprise_influxdb/v1.9/concepts/glossary/#field).
|
||||
Each field is kept as its own time-ordered range.
|
||||
The Cache data is not compressed while in memory.
|
||||
|
||||
Queries to the storage engine will merge data from the Cache with data from the TSM files.
|
||||
Queries execute on a copy of the data that is made from the cache at query processing time.
|
||||
This way writes that come in while a query is running won't affect the result.
|
||||
|
||||
Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
|
||||
|
||||
The Cache exposes a few controls for snapshotting behavior.
|
||||
The two most important controls are the memory limits.
|
||||
There is a lower bound, [`cache-snapshot-memory-size`](/enterprise_influxdb/v1.9/administration/config#cache-snapshot-memory-size-25m), which when exceeded will trigger a snapshot to TSM files and remove the corresponding WAL segments.
|
||||
There is also an upper bound, [`cache-max-memory-size`](/enterprise_influxdb/v1.9/administration/config#cache-max-memory-size-1g), which when exceeded will cause the Cache to reject new writes.
|
||||
These configurations are useful to prevent out of memory situations and to apply back pressure to clients writing data faster than the instance can persist it.
|
||||
The checks for memory thresholds occur on every write.
|
||||
|
||||
The other snapshot controls are time based.
|
||||
The idle threshold, [`cache-snapshot-write-cold-duration`](/enterprise_influxdb/v1.9/administration/config#cache-snapshot-write-cold-duration-10m), forces the Cache to snapshot to TSM files if it hasn't received a write within the specified interval.
|
||||
|
||||
The in-memory Cache is recreated on restart by re-reading the WAL files on disk.
|
||||
|
||||
### TSM files
|
||||
|
||||
TSM files are a collection of read-only files that are memory mapped.
|
||||
The structure of these files looks very similar to an SSTable in LevelDB or other LSM Tree variants.
|
||||
|
||||
A TSM file is composed of four sections: header, blocks, index, and footer.
|
||||
|
||||
```
|
||||
+--------+------------------------------------+-------------+--------------+
|
||||
| Header | Blocks | Index | Footer |
|
||||
|5 bytes | N bytes | N bytes | 4 bytes |
|
||||
+--------+------------------------------------+-------------+--------------+
|
||||
```
|
||||
|
||||
The Header is a magic number to identify the file type and a version number.
|
||||
|
||||
```
|
||||
+-------------------+
|
||||
| Header |
|
||||
+-------------------+
|
||||
| Magic │ Version |
|
||||
| 4 bytes │ 1 byte |
|
||||
+-------------------+
|
||||
```
|
||||
|
||||
Blocks are sequences of pairs of CRC32 checksums and data.
|
||||
The block data is opaque to the file.
|
||||
The CRC32 is used for block level error detection.
|
||||
The length of the blocks is stored in the index.
|
||||
|
||||
```
|
||||
+--------------------------------------------------------------------+
|
||||
│ Blocks │
|
||||
+---------------------+-----------------------+----------------------+
|
||||
| Block 1 | Block 2 | Block N |
|
||||
+---------------------+-----------------------+----------------------+
|
||||
| CRC | Data | CRC | Data | CRC | Data |
|
||||
| 4 bytes | N bytes | 4 bytes | N bytes | 4 bytes | N bytes |
|
||||
+---------------------+-----------------------+----------------------+
|
||||
```
|
||||
|
||||
Following the blocks is the index for the blocks in the file.
|
||||
The index is composed of a sequence of index entries ordered lexicographically by key and then by time.
|
||||
The key includes the measurement name, tag set, and one field.
|
||||
Multiple fields per point create multiple index entries in the TSM file.
|
||||
Each index entry starts with a key length and the key, followed by the block type (float, int, bool, string) and a count of the number of index block entries that follow for that key.
|
||||
Each index block entry is composed of the min and max time for the block, the offset into the file where the block is located and the size of the block. There is one index block entry for each block in the TSM file that contains the key.
|
||||
|
||||
The index structure can provide efficient access to all blocks as well as the ability to determine the cost associated with accessing a given key.
|
||||
Given a key and timestamp, we can determine whether a file contains the block for that timestamp.
|
||||
We can also determine where that block resides and how much data must be read to retrieve the block.
|
||||
Knowing the size of the block, we can efficiently provision our IO statements.
|
||||
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
│ Index │
|
||||
+-----------------------------------------------------------------------------+
|
||||
│ Key Len │ Key │ Type │ Count │Min Time │Max Time │ Offset │ Size │...│
|
||||
│ 2 bytes │ N bytes │1 byte│2 bytes│ 8 bytes │ 8 bytes │8 bytes │4 bytes │ │
|
||||
+-----------------------------------------------------------------------------+
|
||||
```
|
||||
|
||||
The last section is the footer that stores the offset of the start of the index.
|
||||
|
||||
```
|
||||
+---------+
|
||||
│ Footer │
|
||||
+---------+
|
||||
│Index Ofs│
|
||||
│ 8 bytes │
|
||||
+---------+
|
||||
```
|
||||
|
||||
### Compression
|
||||
|
||||
Each block is compressed to reduce storage space and disk IO when querying.
|
||||
A block contains the timestamps and values for a given series and field.
|
||||
Each block has one byte header, followed by the compressed timestamps and then the compressed values.
|
||||
|
||||
```
|
||||
+--------------------------------------------------+
|
||||
| Type | Len | Timestamps | Values |
|
||||
|1 Byte | VByte | N Bytes | N Bytes │
|
||||
+--------------------------------------------------+
|
||||
```
|
||||
|
||||
The timestamps and values are compressed and stored separately using encodings dependent on the data type and its shape.
|
||||
Storing them independently allows timestamp encoding to be used for all timestamps, while allowing different encodings for different field types.
|
||||
For example, some points may be able to use run-length encoding whereas others may not.
|
||||
|
||||
Each value type also contains a 1 byte header indicating the type of compression for the remaining bytes.
|
||||
The four high bits store the compression type and the four low bits are used by the encoder if needed.
|
||||
|
||||
#### Timestamps
|
||||
|
||||
Timestamp encoding is adaptive and based on the structure of the timestamps that are encoded.
|
||||
It uses a combination of delta encoding, scaling, and compression using simple8b run-length encoding, as well as falling back to no compression if needed.
|
||||
|
||||
Timestamp resolution is variable but can be as granular as a nanosecond, requiring up to 8 bytes to store uncompressed.
|
||||
During encoding, the values are first delta-encoded.
|
||||
The first value is the starting timestamp and subsequent values are the differences from the prior value.
|
||||
This usually converts the values into much smaller integers that are easier to compress.
|
||||
Many timestamps are also monotonically increasing and fall on even boundaries of time such as every 10s.
|
||||
When timestamps have this structure, they are scaled by the largest common divisor that is also a factor of 10.
|
||||
This has the effect of converting very large integer deltas into smaller ones that compress even better.
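
The following Go sketch shows just those two steps, delta encoding followed by scaling by a shared power-of-10 divisor, on a handful of timestamps; it is illustrative only and leaves out the run-length and simple8b stages described next:

```go
package main

import "fmt"

// deltaAndScale delta-encodes sorted nanosecond timestamps and then divides
// the deltas by the largest power of 10 that evenly divides all of them.
// Illustrative only; the real encoder then applies run-length or simple8b
// encoding to the result.
func deltaAndScale(ts []int64) (first int64, deltas []int64, scale int64) {
	scale = 1
	if len(ts) == 0 {
		return 0, nil, scale
	}
	first = ts[0]
	for i := 1; i < len(ts); i++ {
		deltas = append(deltas, ts[i]-ts[i-1])
	}

	// Find the largest power-of-10 divisor shared by every delta.
	for next := int64(10); next <= 1_000_000_000_000_000_000 && len(deltas) > 0; next *= 10 {
		divisible := true
		for _, d := range deltas {
			if d%next != 0 {
				divisible = false
				break
			}
		}
		if !divisible {
			break
		}
		scale = next
	}
	for i := range deltas {
		deltas[i] /= scale
	}
	return first, deltas, scale
}

func main() {
	// Points written every 10 seconds, at nanosecond resolution.
	ts := []int64{1472515200000000000, 1472515210000000000, 1472515220000000000}
	first, deltas, scale := deltaAndScale(ts)
	fmt.Println(first, deltas, scale) // 1472515200000000000 [1 1] 10000000000
}
```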
|
||||
|
||||
Using these adjusted values, if all the deltas are the same, the time range is stored using run-length encoding.
|
||||
If run-length encoding is not possible and all values are less than (1 << 60) - 1 ([~36.5 years](https://www.wolframalpha.com/input/?i=\(1+%3C%3C+60\)+-+1+nanoseconds+to+years) at nanosecond resolution), then the timestamps are encoded using [simple8b encoding](https://github.com/jwilder/encoding/tree/master/simple8b).
|
||||
Simple8b encoding is a 64bit word-aligned integer encoding that packs multiple integers into a single 64bit word.
|
||||
If any value exceeds the maximum, the deltas are stored uncompressed, using 8 bytes each for the block.
|
||||
Future encodings may use a patched scheme such as Patched Frame-Of-Reference (PFOR) to handle outliers more effectively.
|
||||
|
||||
#### Floats
|
||||
|
||||
Floats are encoded using an implementation of the [Facebook Gorilla paper](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
|
||||
The encoding XORs consecutive values together to produce a small result when the values are close together.
|
||||
The delta is then stored using control bits to indicate how many leading and trailing zeroes are in the XOR value.
|
||||
Our implementation removes the timestamp encoding described in the paper and only encodes the float values.
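
A minimal Go sketch of that XOR step, leaving the control-bit encoding out entirely:

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorFloat XORs the IEEE-754 bit patterns of two consecutive values and
// reports how many leading and trailing zero bits the result has, which is
// what lets the encoder store only the meaningful middle bits.
func xorFloat(prev, cur float64) (xor uint64, leading, trailing int) {
	xor = math.Float64bits(prev) ^ math.Float64bits(cur)
	return xor, bits.LeadingZeros64(xor), bits.TrailingZeros64(xor)
}

func main() {
	x, lead, trail := xorFloat(50.1, 50.2) // close values share most bits
	fmt.Printf("xor=%064b leading=%d trailing=%d\n", x, lead, trail)
}
```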
|
||||
|
||||
#### Integers
|
||||
|
||||
Integer encoding uses two different strategies depending on the range of values in the uncompressed data.
|
||||
Encoded values are first encoded using [ZigZag encoding](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers).
|
||||
This interleaves positive and negative integers across a range of positive integers.
|
||||
|
||||
For example, [-2,-1,0,1] becomes [3,1,0,2].
|
||||
See Google's [Protocol Buffers documentation](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers) for more information.
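
A minimal Go sketch of the transform (not the engine's actual code):

```go
package main

import "fmt"

// zigZag maps signed integers onto unsigned ones so that values near zero
// stay small: 0→0, -1→1, 1→2, -2→3, and so on.
func zigZag(v int64) uint64 {
	return uint64((v << 1) ^ (v >> 63))
}

// unZigZag reverses the mapping.
func unZigZag(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	for _, v := range []int64{-2, -1, 0, 1} {
		fmt.Print(zigZag(v), " ") // prints: 3 1 0 2
	}
	fmt.Println()
}
```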
|
||||
|
||||
If all ZigZag encoded values are less than (1 << 60) - 1, they are compressed using simple8b encoding.
|
||||
If any values are larger than the maximum then all values are stored uncompressed in the block.
|
||||
If all values are identical, run-length encoding is used.
|
||||
This works very well for values that are frequently constant.
|
||||
|
||||
#### Booleans
|
||||
|
||||
Booleans are encoded using a simple bit packing strategy where each Boolean uses 1 bit.
|
||||
The number of Booleans encoded is stored using variable-byte encoding at the beginning of the block.
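
A Go sketch of that layout; the exact bit order and header format here are assumptions made for illustration:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packBools writes a varint count header followed by the values packed one
// bit per Boolean. Bit ordering within a byte is an assumption in this sketch.
func packBools(vals []bool) []byte {
	header := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(header, uint64(len(vals)))
	out := header[:n]

	var cur byte
	for i, v := range vals {
		if v {
			cur |= 1 << uint(i%8)
		}
		if i%8 == 7 {
			out = append(out, cur)
			cur = 0
		}
	}
	if len(vals)%8 != 0 {
		out = append(out, cur)
	}
	return out
}

func main() {
	fmt.Printf("% x\n", packBools([]bool{true, false, true, true})) // prints: 04 0d
}
```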
|
||||
|
||||
#### Strings
|
||||
Strings are encoded using [Snappy](http://google.github.io/snappy/) compression.
|
||||
Each string is packed consecutively and they are compressed as one larger block.
|
||||
|
||||
### Compactions
|
||||
|
||||
Compactions are recurring processes that migrate data stored in a write-optimized format into a more read-optimized format.
|
||||
There are a number of stages of compaction that take place while a shard is hot for writes:
|
||||
|
||||
* Snapshots - Values in the Cache and WAL must be converted to TSM files to free memory and disk space used by the WAL segments.
|
||||
These compactions occur based on the cache memory and time thresholds.
|
||||
* Level Compactions - Level compactions (levels 1-4) occur as the TSM files grow.
|
||||
TSM files are compacted from snapshots to level 1 files.
|
||||
Multiple level 1 files are compacted to produce level 2 files.
|
||||
The process continues until files reach level 4 and the max size for a TSM file.
|
||||
They will not be compacted further unless deletes, index optimization compactions, or full compactions need to run.
|
||||
Lower level compactions use strategies that avoid CPU-intensive activities like decompressing and combining blocks.
|
||||
Higher level (and thus less frequent) compactions will re-combine blocks to fully compact them and increase the compression ratio.
|
||||
* Index Optimization - When many level 4 TSM files accumulate, the internal indexes become larger and more costly to access.
|
||||
An index optimization compaction splits the series and indices across a new set of TSM files, sorting all points for a given series into one TSM file.
|
||||
Before an index optimization, each TSM file contained points for most or all series, and thus each contains the same series index.
|
||||
After an index optimization, each TSM file contains points from a minimum of series and there is little series overlap between files.
|
||||
Each TSM file thus has a smaller unique series index, instead of a duplicate of the full series list.
|
||||
In addition, all points from a particular series are contiguous in a TSM file rather than spread across multiple TSM files.
|
||||
* Full Compactions - Full compactions run when a shard has become cold for writes for a long time, or when deletes have occurred on the shard.
|
||||
Full compactions produce an optimal set of TSM files and include all optimizations from Level and Index Optimization compactions.
|
||||
Once a shard is fully compacted, no other compactions will run on it unless new writes or deletes are stored.
|
||||
|
||||
### Writes
|
||||
|
||||
Writes are appended to the current WAL segment and are also added to the Cache.
|
||||
Each WAL segment has a maximum size.
|
||||
Writes roll over to a new file once the current file fills up.
|
||||
The cache is also size bounded; snapshots are taken and WAL compactions are initiated when the cache becomes too full.
|
||||
If the inbound write rate exceeds the WAL compaction rate for a sustained period, the cache may become too full, in which case new writes will fail until the snapshot process catches up.
|
||||
|
||||
When WAL segments fill up and are closed, the Compactor snapshots the Cache and writes the data to a new TSM file.
|
||||
When the TSM file is successfully written and `fsync`'d, it is loaded and referenced by the FileStore.
|
||||
|
||||
### Updates
|
||||
|
||||
Updates (writing a newer value for a point that already exists) occur as normal writes.
|
||||
Since cached values overwrite existing values, newer writes take precedence.
|
||||
If a write would overwrite a point in a prior TSM file, the points are merged at query runtime and the newer write takes precedence.
|
||||
|
||||
|
||||
### Deletes
|
||||
|
||||
Deletes occur by writing a delete entry to the WAL for the measurement or series and then updating the Cache and FileStore.
|
||||
The Cache evicts all relevant entries.
|
||||
The FileStore writes a tombstone file for each TSM file that contains relevant data.
|
||||
These tombstone files are used at startup time to ignore blocks as well as during compactions to remove deleted entries.
|
||||
|
||||
Queries against partially deleted series are handled at query time until a compaction removes the data fully from the TSM files.
|
||||
|
||||
### Queries
|
||||
|
||||
When a query is executed by the storage engine, it is essentially a seek to a given time associated with a specific series key and field.
|
||||
First, we do a search on the data files to find the files that contain a time range matching the query as well as containing matching series.
|
||||
|
||||
Once we have the data files selected, we next need to find the position in the file of the series key index entries.
|
||||
We run a binary search against each TSM index to find the location of its index blocks.
|
||||
|
||||
In common cases the blocks will not overlap across multiple TSM files and we can search the index entries linearly to find the start block from which to read.
|
||||
If there are overlapping blocks of time, the index entries are sorted to ensure newer writes will take precedence and that blocks can be processed in order during query execution.
|
||||
|
||||
When iterating over the index entries the blocks are read sequentially from the blocks section.
|
||||
The block is decompressed and we seek to the specific point.
|
||||
|
||||
|
||||
# The new InfluxDB storage engine: from LSM Tree to B+Tree and back again to create the Time Structured Merge Tree
|
||||
|
||||
Writing a new storage format should be a last resort.
|
||||
So how did InfluxData end up writing our own engine?
|
||||
InfluxData has experimented with many storage formats and found each lacking in some fundamental way.
|
||||
The performance requirements for InfluxDB are significant, and eventually overwhelm other storage systems.
|
||||
The 0.8 line of InfluxDB allowed multiple storage engines, including LevelDB, RocksDB, HyperLevelDB, and LMDB.
|
||||
The 0.9 line of InfluxDB used BoltDB as the underlying storage engine.
|
||||
This writeup is about the Time Structured Merge Tree storage engine that was released in 0.9.5 and is the only storage engine supported in InfluxDB 0.11+, including the entire 1.x family.
|
||||
|
||||
The properties of the time series data use case make it challenging for many existing storage engines.
|
||||
Over the course of InfluxDB development, InfluxData tried a few of the more popular options.
|
||||
We started with LevelDB, an engine based on LSM Trees, which are optimized for write throughput.
|
||||
After that we tried BoltDB, an engine based on a memory mapped B+Tree, which is optimized for reads.
|
||||
Finally, we ended up building our own storage engine that is similar in many ways to LSM Trees.
|
||||
|
||||
With our new storage engine we were able to achieve up to a 45x reduction in disk space usage from our B+Tree setup with even greater write throughput and compression than what we saw with LevelDB and its variants.
|
||||
This post will cover the details of that evolution and end with an in-depth look at our new storage engine and its inner workings.
|
||||
|
||||
## Properties of time series data
|
||||
|
||||
The workload of time series data is quite different from normal database workloads.
|
||||
There are a number of factors that conspire to make it very difficult to scale and remain performant:
|
||||
|
||||
* Billions of individual data points
|
||||
* High write throughput
|
||||
* High read throughput
|
||||
* Large deletes (data expiration)
|
||||
* Mostly an insert/append workload, very few updates
|
||||
|
||||
The first and most obvious problem is one of scale.
|
||||
In DevOps, IoT, or APM it is easy to collect hundreds of millions or billions of unique data points every day.
|
||||
|
||||
For example, let's say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds.
|
||||
Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day per server.
|
||||
That gives us a total of 172,800,000 (`200 * 100 * 8,640`) individual data points per day.
|
||||
We find similar or larger numbers in sensor data use cases.
|
||||
|
||||
The volume of data means that the write throughput can be very high.
|
||||
We regularly get requests for setups that can handle hundreds of thousands of writes per second.
|
||||
Some larger companies will only consider systems that can handle millions of writes per second.
|
||||
|
||||
At the same time, time series data can be a high read throughput use case.
|
||||
It's true that if you're tracking 700,000 unique metrics or time series you can't hope to visualize all of them.
|
||||
That leads many people to think that you don't actually read most of the data that goes into the database.
|
||||
However, other than dashboards that people have up on their screens, there are automated systems for monitoring or combining the large volume of time series data with other types of data.
|
||||
|
||||
Inside InfluxDB, aggregate functions calculated on the fly may combine tens of thousands of distinct time series into a single view.
|
||||
Each one of those queries must read each aggregated data point, so for InfluxDB the read throughput is often many times higher than the write throughput.
|
||||
|
||||
Given that time series is mostly an append-only workload, you might think that it's possible to get great performance on a B+Tree.
|
||||
Appends in the keyspace are efficient, and you can achieve more than 100,000 appends per second.
|
||||
However, we have those appends happening in individual time series.
|
||||
So the inserts end up looking more like random inserts than append only inserts.
|
||||
|
||||
One of the biggest problems we found with time series data is that it's very common to delete all data after it gets past a certain age.
|
||||
The common pattern here is that users have high precision data that is kept for a short period of time like a few days or months.
|
||||
Users then downsample and aggregate that data into lower precision rollups that are kept around much longer.
|
||||
|
||||
The naive implementation would be to simply delete each record once it passes its expiration time.
|
||||
However, that means that once the first points written reach their expiration date, the system is processing just as many deletes as writes, which is something most storage engines aren't designed for.
|
||||
|
||||
Let's dig into the details of the two types of storage engines we tried and how these properties had a significant impact on our performance.
|
||||
|
||||
## LevelDB and log structured merge trees
|
||||
|
||||
When the InfluxDB project began, we picked LevelDB as the storage engine because we had used it for time series data storage in the product that was the precursor to InfluxDB.
|
||||
We knew that it had great properties for write throughput and everything seemed to "just work".
|
||||
|
||||
LevelDB is an implementation of a log structured merge tree (LSM tree) that was built as an open source project at Google.
|
||||
It exposes an API for a key-value store where the key space is sorted.
|
||||
This last part is important for time series data as it allowed us to quickly scan ranges of time as long as the timestamp was in the key.
|
||||
|
||||
LSM Trees are based on a log that takes writes and two structures known as Mem Tables and SSTables.
|
||||
These tables represent the sorted keyspace.
|
||||
SSTables are read-only files that are continuously replaced by other SSTables that merge inserts and updates into the keyspace.
|
||||
|
||||
The two biggest advantages that LevelDB had for us were high write throughput and built in compression.
|
||||
However, as we learned more about what people needed with time series data, we encountered a few insurmountable challenges.
|
||||
|
||||
The first problem we had was that LevelDB doesn't support hot backups.
|
||||
If you want to do a safe backup of the database, you have to close it and then copy it.
|
||||
The LevelDB variants RocksDB and HyperLevelDB fix this problem, but there was another more pressing problem that we didn't think they could solve.
|
||||
|
||||
Our users needed a way to automatically manage data retention.
|
||||
That meant we needed deletes on a very large scale.
|
||||
In LSM Trees, a delete is as expensive, if not more so, than a write.
|
||||
A delete writes a new record known as a tombstone.
|
||||
After that queries merge the result set with any tombstones to purge the deleted data from the query return.
|
||||
Later, a compaction runs that removes the tombstone record and the underlying deleted record in the SSTable file.
|
||||
|
||||
To get around doing deletes, we split data across what we call shards, which are contiguous blocks of time.
|
||||
Shards would typically hold either one day or seven days worth of data.
|
||||
Each shard mapped to an underlying LevelDB.
|
||||
This meant that we could drop an entire day of data by just closing out the database and removing the underlying files.
|
||||
|
||||
Users of RocksDB may at this point bring up a feature called ColumnFamilies.
|
||||
When putting time series data into Rocks, it's common to split blocks of time into column families and then drop those when their time is up.
|
||||
It's the same general idea: create a separate area where you can just drop files instead of updating indexes when you delete a large block of data.
|
||||
Dropping a column family is a very efficient operation.
|
||||
However, column families are a fairly new feature and we had another use case for shards.
|
||||
|
||||
Organizing data into shards meant that it could be moved within a cluster without having to examine billions of keys.
|
||||
At the time of this writing, it was not possible to move a column family in one RocksDB to another.
|
||||
Old shards are typically cold for writes so moving them around would be cheap and easy.
|
||||
We would have the added benefit of having a spot in the keyspace that is cold for writes so it would be easier to do consistency checks later.
|
||||
|
||||
The organization of data into shards worked great for a while, until a large amount of data went into InfluxDB.
|
||||
LevelDB splits the data out over many small files.
|
||||
Having dozens or hundreds of these databases open in a single process ended up creating a big problem.
|
||||
Users that had six months or a year of data would run out of file handles.
|
||||
It's not something we found with the majority of users, but anyone pushing the database to its limits would hit this problem and we had no fix for it.
|
||||
There were simply too many file handles open.
|
||||
|
||||
## BoltDB and mmap B+Trees
|
||||
|
||||
After struggling with LevelDB and its variants for a year we decided to move over to BoltDB, a pure Golang database heavily inspired by LMDB, a mmap B+Tree database written in C.
|
||||
It has the same API semantics as LevelDB: a key value store where the keyspace is ordered.
|
||||
Many of our users were surprised.
|
||||
Our own posted tests of the LevelDB variants vs. LMDB (a mmap B+Tree) showed RocksDB as the best performer.
|
||||
|
||||
However, there were other considerations that went into this decision outside of the pure write performance.
|
||||
At this point our most important goal was to get to something stable that could be run in production and backed up.
|
||||
BoltDB also had the advantage of being written in pure Go, which simplified our build chain immensely and made it easy to build for other OSes and platforms.
|
||||
|
||||
The biggest win for us was that BoltDB used a single file as the database.
|
||||
At this point our most common source of bug reports were from people running out of file handles.
|
||||
Bolt solved the hot backup problem and the file limit problems all at the same time.
|
||||
|
||||
We were willing to take a hit on write throughput if it meant that we'd have a system that was more reliable and stable that we could build on.
|
||||
Our reasoning was that for anyone pushing really big write loads, they'd be running a cluster anyway.
|
||||
|
||||
We released versions 0.9.0 to 0.9.2 based on BoltDB.
|
||||
From a development perspective it was delightful.
|
||||
Clean API, fast and easy to build in our Go project, and reliable.
|
||||
However, after running for a while we found a big problem with write throughput.
|
||||
After the database got over a few GB, writes would start spiking IOPS.
|
||||
|
||||
Some users were able to get past this by putting InfluxDB on big hardware with near unlimited IOPS.
|
||||
However, most users are on VMs with limited resources in the cloud.
|
||||
We had to figure out a way to reduce the impact of writing a bunch of points into hundreds of thousands of series at a time.
|
||||
|
||||
With the 0.9.3 and 0.9.4 releases our plan was to put a write ahead log (WAL) in front of Bolt.
|
||||
That way we could reduce the number of random insertions into the keyspace.
|
||||
Instead, we'd buffer up multiple writes that were next to each other and then flush them at once.
|
||||
However, that only served to delay the problem.
|
||||
High IOPS still became an issue and it showed up very quickly for anyone operating at even moderate workloads.
|
||||
|
||||
However, our experience building the first WAL implementation in front of Bolt gave us the confidence we needed that the write problem could be solved.
|
||||
The performance of the WAL itself was fantastic, the index simply could not keep up.
|
||||
At this point we started thinking again about how we could create something similar to an LSM Tree that could keep up with our write load.
|
||||
|
||||
Thus was born the Time Structured Merge Tree.
|
|
|
|||
---
|
||||
title: Time Series Index (TSI) overview
|
||||
description: >
|
||||
The Time Series Index (TSI) storage engine supports high cardinality in time series data.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Time Series Index (TSI) overview
|
||||
weight: 70
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
Find overview and background information on Time Series Index (TSI) in this topic. For detail, including how to enable and configure TSI, see [Time Series Index (TSI) details](/enterprise_influxdb/v1.9/concepts/tsi-details/).
|
||||
|
||||
## Overview
|
||||
|
||||
To support a large number of time series, that is, a very high cardinality in the number of unique time series that the database stores, InfluxData has added the new Time Series Index (TSI).
|
||||
InfluxData supports customers using InfluxDB with tens of millions of time series.
|
||||
InfluxData's goal, however, is to expand to hundreds of millions, and eventually billions.
|
||||
Using InfluxData's TSI storage engine, users should be able to have millions of unique time series.
|
||||
The goal is that the number of series should be unbounded by the amount of memory on the server hardware.
|
||||
Importantly, the number of series that exist in the database will have a negligible impact on database startup time.
|
||||
This work represents the most significant technical advancement in the database since InfluxData released the Time-Structured Merge Tree (TSM) storage engine in 2016.
|
||||
|
||||
## Background information
|
||||
|
||||
InfluxDB actually looks like two databases in one, a time series data store and an inverted index for the measurement, tag, and field metadata.
|
||||
|
||||
### Time-Structured Merge Tree (TSM)
|
||||
|
||||
The Time-Structured Merge Tree (TSM) engine solves the problem of getting maximum throughput, compression, and query speed for raw time series data.
|
||||
Up until TSI, the inverted index was an in-memory data structure that was built during startup of the database based on the data in TSM.
|
||||
This meant that for every measurement, tag key-value pair, and field name, there was a lookup table in-memory to map those bits of metadata to an underlying time series.
|
||||
For users with a high number of ephemeral series, memory utilization continued increasing as new time series were created.
|
||||
And, startup times increased since all of that data would have to be loaded onto the heap at start time.
|
||||
|
||||
> For details, see [TSM-based data storage and in-memory indexing](/enterprise_influxdb/v1.9/concepts/storage_engine/).
|
||||
|
||||
### Time Series Index (TSI)
|
||||
|
||||
The new time series index (TSI) moves the index to files on disk that we memory map.
|
||||
This means that we let the operating system handle being the Least Recently Used (LRU) memory.
|
||||
Much like the TSM engine for raw time series data we have a write-ahead log with an in-memory structure that gets merged at query time with the memory-mapped index.
|
||||
Background routines run constantly to compact the index into larger and larger files to avoid having to do too many index merges at query time.
|
||||
Under the covers, we’re using techniques like Robin Hood Hashing to do fast index lookups and HyperLogLog++ to keep sketches of cardinality estimates.
|
||||
The latter will give us the ability to add things to the query languages like the [SHOW CARDINALITY](/enterprise_influxdb/v1.9/query_language/spec#show-cardinality) queries.
|
||||
|
||||
### Issues solved by TSI and remaining to be solved
|
||||
|
||||
The primary issue that Time Series Index (TSI) addresses is ephemeral time series. Most frequently, this occurs in use cases that track per-process or per-container metrics by putting identifiers in tags. For example, the [Heapster project for Kubernetes](https://github.com/kubernetes/heapster) does this. Series that are no longer hot for writes or queries no longer take up space in memory.
|
||||
|
||||
The issue that remains for the Heapster project and similar use cases is limiting the scope of data returned by the SHOW queries. We’ll have updates to the query language in the future to limit those results by time. We also don’t solve the problem of having all these series hot for reads and writes. For that problem, scale-out clustering is the solution. We’ll have to continue to optimize the query language and engine to work with large sets of series. We’ll need to add guard rails and limits into the language and eventually add spill-to-disk query processing. That work will be ongoing in every release of InfluxDB.
|
|
|
|||
---
|
||||
title: Time Series Index (TSI) details
|
||||
description: Enable and understand the Time Series Index (TSI).
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Time Series Index (TSI) details
|
||||
weight: 80
|
||||
parent: Concepts
|
||||
---
|
||||
|
||||
When InfluxDB ingests data, we store not only the value but we also index the measurement and tag information so that it can be queried quickly.
|
||||
In earlier versions, index data could only be stored in-memory, however, that requires a lot of RAM and places an upper bound on the number of series a machine can hold.
|
||||
This upper bound is usually somewhere between 1 and 4 million series, depending on the machine used.
|
||||
|
||||
The Time Series Index (TSI) was developed to allow us to go past that upper bound.
|
||||
TSI stores index data on disk so that we are no longer restricted by RAM.
|
||||
TSI uses the operating system's page cache to pull hot data into memory and let cold data rest on disk.
|
||||
|
||||
## Enable TSI
|
||||
|
||||
To enable TSI, set the following line in the InfluxDB configuration file (`influxdb.conf`):
|
||||
|
||||
```
|
||||
index-version = "tsi1"
|
||||
```
|
||||
|
||||
(Be sure to include the double quotes.)
|
||||
|
||||
### InfluxDB Enterprise
|
||||
|
||||
- To convert your data nodes to support TSI, see [Upgrade InfluxDB Enterprise clusters](/enterprise_influxdb/v1.8/administration/upgrading/).
|
||||
|
||||
- For detail on configuration, see [Configure InfluxDB Enterprise clusters](/enterprise_influxdb/v1.8/administration/configuration/).
|
||||
|
||||
### InfluxDB OSS
|
||||
|
||||
- For detail on configuration, see [Configuring InfluxDB OSS](/enterprise_influxdb/v1.9/administration/config/).
|
||||
|
||||
## Tooling
|
||||
|
||||
### `influx_inspect dumptsi`
|
||||
|
||||
If you are troubleshooting an issue with an index, you can use the `influx_inspect dumptsi` command.
|
||||
This command allows you to print summary statistics on an index, file, or a set of files.
|
||||
This command only works on one index at a time.
|
||||
|
||||
For details on this command, see [influx_inspect dumptsi](/enterprise_influxdb/v1.9/tools/influx_inspect/#dumptsi).
|
||||
|
||||
### `influx_inspect buildtsi`
|
||||
|
||||
If you want to convert an existing shard from an in-memory index to a TSI index, or if you have an existing TSI index which has become corrupt, you can use the `buildtsi` command to create the index from the underlying TSM data.
|
||||
If you have an existing TSI index that you want to rebuild, first delete the `index` directory within your shard.
|
||||
|
||||
This command works at the server level, but you can optionally add database, retention policy, and shard filters so it applies only to a subset of shards.
|
||||
|
||||
For details on this command, see [influx_inspect buildtsi](/enterprise_influxdb/v1.9/tools/influx_inspect/#buildtsi).
|
||||
|
||||
|
||||
## Understanding TSI
|
||||
|
||||
### File organization
|
||||
|
||||
TSI (Time Series Index) is a log-structured merge tree-based database for InfluxDB series data.
|
||||
TSI is composed of several parts:
|
||||
|
||||
* **Index**: Contains the entire index dataset for a single shard.
|
||||
|
||||
* **Partition**: Contains a sharded partition of the data for a shard.
|
||||
|
||||
* **LogFile**: Contains newly written series as an in-memory index and is persisted as a WAL.
|
||||
|
||||
* **IndexFile**: Contains an immutable, memory-mapped index built from a LogFile or merged from two contiguous index files.
|
||||
|
||||
There is also a **SeriesFile** which contains a set of all series keys across the entire database.
|
||||
Each shard within the database shares the same series file.
|
||||
|
||||
### Writes
|
||||
|
||||
The following occurs when a write comes into the system:
|
||||
|
||||
1. The series is added to the series file, or looked up if it already exists. This returns an auto-incrementing series ID.
|
||||
2. The series is sent to the Index. The index maintains a roaring bitmap of existing series IDs and ignores series that have already been created.
|
||||
3. The series is hashed and sent to the appropriate Partition.
|
||||
4. The Partition writes the series as an entry to the LogFile.
|
||||
5. The LogFile writes the series to a write-ahead log file on disk and adds the series to a set of in-memory indexes.
|
||||
|
||||
### Compaction
|
||||
|
||||
Once the LogFile exceeds a threshold (5MB), a new active log file is created and the previous one begins compacting into an IndexFile.
|
||||
This first index file is at level 1 (L1).
|
||||
The log file is considered level 0 (L0).
|
||||
|
||||
Index files can also be created by merging two smaller index files together.
|
||||
For example, if two contiguous L1 index files exist, they can be merged into an L2 index file.
|
||||
|
||||
### Reads
|
||||
|
||||
The index provides several API calls for retrieving sets of data such as:
|
||||
|
||||
* `MeasurementIterator()`: Returns a sorted list of measurement names.
|
||||
* `TagKeyIterator()`: Returns a sorted list of tag keys in a measurement.
|
||||
* `TagValueIterator()`: Returns a sorted list of tag values for a tag key.
|
||||
* `MeasurementSeriesIDIterator()`: Returns a sorted list of all series IDs for a measurement.
|
||||
* `TagKeySeriesIDIterator()`: Returns a sorted list of all series IDs for a tag key.
|
||||
* `TagValueSeriesIDIterator()`: Returns a sorted list of all series IDs for a tag value.
|
||||
|
||||
These iterators are all composable using several merge iterators.
|
||||
For each type of iterator (measurement, tag key, tag value, series id), there are multiple merge iterator types:
|
||||
|
||||
* **Merge**: Deduplicates items from two iterators.
|
||||
* **Intersect**: Returns only items that exist in two iterators.
|
||||
* **Difference**: Only returns items from first iterator that don't exist in the second iterator.
|
||||
|
||||
For example, a query with a WHERE clause of `region != 'us-west'` that operates across two shards will construct a set of iterators like this:
|
||||
|
||||
```
|
||||
DifferenceSeriesIDIterators(
|
||||
MergeSeriesIDIterators(
|
||||
Shard1.MeasurementSeriesIDIterator("m"),
|
||||
Shard2.MeasurementSeriesIDIterator("m"),
|
||||
),
|
||||
MergeSeriesIDIterators(
|
||||
Shard1.TagValueSeriesIDIterator("m", "region", "us-west"),
|
||||
Shard2.TagValueSeriesIDIterator("m", "region", "us-west"),
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
### Log File Structure
|
||||
|
||||
The log file is simply structured as a list of LogEntry objects written to disk in sequential order. Log files are written until they reach 5MB and then they are compacted into index files.
|
||||
The entry objects in the log can be of any of the following types:
|
||||
|
||||
* AddSeries
|
||||
* DeleteSeries
|
||||
* DeleteMeasurement
|
||||
* DeleteTagKey
|
||||
* DeleteTagValue
|
||||
|
||||
The in-memory index on the log file tracks the following:
|
||||
|
||||
* Measurements by name
|
||||
* Tag keys by measurement
|
||||
* Tag values by tag key
|
||||
* Series by measurement
|
||||
* Series by tag value
|
||||
* Tombstones for series, measurements, tag keys, and tag values.
|
||||
|
||||
The log file also maintains bitsets for series ID existence and tombstones.
|
||||
These bitsets are merged with other log files and index files to regenerate the full index bitset on startup.
|
||||
|
||||
### Index File Structure
|
||||
|
||||
The index file is an immutable file that tracks similar information to the log file, but all data is indexed and written to disk so that it can be directly accessed from a memory-map.
|
||||
|
||||
The index file has the following sections:
|
||||
|
||||
* **TagBlocks:** Maintains an index of tag values for a single tag key.
|
||||
* **MeasurementBlock:** Maintains an index of measurements and their tag keys.
|
||||
* **Trailer:** Stores offset information for the file as well as HyperLogLog sketches for cardinality estimation.
|
||||
|
||||
### Manifest
|
||||
|
||||
The MANIFEST file is stored in the index directory and lists all the files that belong to the index and the order in which they should be accessed.
|
||||
This file is updated every time a compaction occurs.
|
||||
Any files that are in the directory but not listed in the MANIFEST are index files that are in the process of being compacted.
|
||||
|
||||
### FileSet
|
||||
|
||||
A file set is an in-memory snapshot of the manifest that is obtained while the InfluxDB process is running.
|
||||
This is required to provide a consistent view of the index at a point-in-time.
|
||||
The file set also facilitates reference counting for all of its files so that no file will be deleted via compaction until all readers of the file are done with it.
|
|
|
|||
---
|
||||
title: InfluxDB Enterprise features
|
||||
description: Users, clustering, and other InfluxDB Enterprise features.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Enterprise features
|
||||
weight: 60
|
||||
---
|
||||
|
||||
{{< children hlevel="h2" >}}
|
|
|
|||
---
|
||||
title: InfluxDB Enterprise cluster features
|
||||
description: Overview of features related to InfluxDB Enterprise clustering.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/clustering-features/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Cluster features
|
||||
weight: 20
|
||||
parent: Enterprise features
|
||||
---
|
||||
|
||||
## Entitlements
|
||||
|
||||
A valid license key is required in order to start `influxd-meta` or `influxd`.
|
||||
License keys restrict the number of data nodes that can be added to a cluster as well as the number of CPU cores a data node can use.
|
||||
Without a valid license, the process will abort startup.
|
||||
|
||||
Access your license expiration date with the `/debug/vars` endpoint.
|
||||
|
||||
{{< keep-url >}}
|
||||
```sh
|
||||
$ curl http://localhost:8086/debug/vars | jq '.entitlements'
|
||||
{
|
||||
"name": "entitlements",
|
||||
"tags": null,
|
||||
"values": {
|
||||
"licenseExpiry": "2022-02-15T00:00:00Z",
|
||||
"licenseType": "license-key"
|
||||
}
|
||||
}
|
||||
```
|
||||
{{% caption %}}
|
||||
This example uses `curl` and [`jq`](https://stedolan.github.io/jq/).
|
||||
{{% /caption %}}
|
||||
|
||||
## Query management
|
||||
|
||||
Query management works cluster wide. Specifically, `SHOW QUERIES` and `KILL QUERY <ID>` on `"<host>"` can be run on any data node. `SHOW QUERIES` will report all queries running across the cluster and the node which is running the query.
|
||||
`KILL QUERY` can abort queries running on the local node or any other remote data node. For details on using the `SHOW QUERIES` and `KILL QUERY` on InfluxDB Enterprise clusters,
|
||||
see [Query Management](/enterprise_influxdb/v1.9/troubleshooting/query_management/).
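
For example (the query ID and host shown are hypothetical):

```
> SHOW QUERIES
> KILL QUERY 36 ON "data-node-02:8088"
```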
|
||||
|
||||
## Subscriptions
|
||||
|
||||
Subscriptions used by Kapacitor work in a cluster. Writes to any node will be forwarded to subscribers across all supported subscription protocols.
|
||||
|
||||
## Continuous queries
|
||||
|
||||
### Configuration and operational considerations on a cluster
|
||||
|
||||
It is important to understand how to configure InfluxDB Enterprise and how this impacts the continuous queries (CQ) engine’s behavior:
|
||||
|
||||
- **Data node configuration** `[continuous queries]`
|
||||
[run-interval](/enterprise_influxdb/v1.9/administration/config-data-nodes#run-interval-1s)
|
||||
-- The interval at which InfluxDB checks to see if a CQ needs to run. Set this option to the lowest interval
|
||||
at which your CQs run. For example, if your most frequent CQ runs every minute, set run-interval to 1m.
|
||||
- **Meta node configuration** `[meta]`
|
||||
[lease-duration](/enterprise_influxdb/v1.9/administration/config-meta-nodes#lease-duration-1m0s)
|
||||
-- The default duration of the leases that data nodes acquire from the meta nodes. Leases automatically expire after the
|
||||
lease-duration is met. Leases ensure that only one data node is running something at a given time. For example, Continuous
|
||||
Queries use a lease so that all data nodes aren’t running the same CQs at once.
|
||||
- **Execution time of CQs** – CQs are sequentially executed. Depending on the amount of work that they need to accomplish
|
||||
in order to complete, the configuration parameters mentioned above can have an impact on the observed behavior of CQs.
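
As a sketch, the two settings live in different configuration files; the file names and section headers shown here follow the data node and meta node configuration references linked above, and the values shown are the defaults:

```
# Data node configuration file
[continuous_queries]
  run-interval = "1s"

# Meta node configuration file
[meta]
  lease-duration = "1m0s"
```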
|
||||
|
||||
The CQ service is running on every node, but only a single node is granted exclusive access to execute CQs at any one time.
|
||||
However, every time the `run-interval` elapses (and assuming a node isn't currently executing CQs), a node attempts to
|
||||
acquire the CQ lease. By default the `run-interval` is one second – so the data nodes are aggressively checking to see
|
||||
if they can acquire the lease. On clusters where all CQs execute in an amount of time less than `lease-duration`
|
||||
(default is 1m), there's a good chance that the first data node to acquire the lease will still hold the lease when
|
||||
the `run-interval` elapses. Other nodes will be denied the lease and when the node holding the lease requests it again,
|
||||
the lease is renewed with the expiration extended to `lease-duration`. So in a typical situation, we observe that a
|
||||
single data node acquires the CQ lease and holds on to it. It effectively becomes the executor of CQs until it is
|
||||
recycled (for any reason).
|
||||
|
||||
Now consider the following case: CQs take longer to execute than the `lease-duration`, so when the lease expires,
|
||||
~1 second later another data node requests and is granted the lease. The original holder of the lease is busily working
|
||||
on sequentially executing the list of CQs it was originally handed and the data node now holding the lease begins
|
||||
executing CQs from the top of the list.
|
||||
|
||||
Based on this scenario, it may appear that CQs are “executing in parallel” because multiple data nodes are
|
||||
essentially “rolling” sequentially through the registered CQs and the lease is rolling from node to node.
|
||||
The “long pole” here is effectively your most complex CQ – and it likely means that at some point all nodes
|
||||
are attempting to execute that same complex CQ (and likely competing for resources as they overwrite points
|
||||
generated by that query on each node that is executing it --- likely with some phased offset).
|
||||
|
||||
To avoid this behavior (which is desirable, because it reduces the overall load on your cluster),
|
||||
set the `lease-duration` to a value greater than the aggregate execution time of **all** the CQs that you are running.
|
||||
|
||||
Given the current way in which CQs execute, the way to achieve parallelism is to use
|
||||
Kapacitor for the more complex CQs that you are attempting to run.
|
||||
[See Kapacitor as a continuous query engine](/{{< latest "kapacitor" >}}/guides/continuous_queries/).
|
||||
However, you can keep the simpler, highly performant CQs within the database –
|
||||
but ensure that the lease duration is greater than their aggregate execution time so that
|
||||
“extra” load is not being unnecessarily introduced on your cluster.
|
||||
|
||||
|
||||
## PProf endpoints
|
||||
|
||||
Meta nodes expose the `/debug/pprof` endpoints for profiling and troubleshooting.
|
||||
|
||||
## Shard movement
|
||||
|
||||
* [Copy shard](/enterprise_influxdb/v1.9/tools/influxd-ctl/#copy-shard) support - copy a shard from one node to another
|
||||
* [Copy shard status](/enterprise_influxdb/v1.9/tools/influxd-ctl/#copy-shard-status) - query the status of a copy shard request
|
||||
* [Kill copy shard](/enterprise_influxdb/v1.9/tools/influxd-ctl/#kill-copy-shard) - kill a running shard copy
|
||||
* [Remove shard](/enterprise_influxdb/v1.9/tools/influxd-ctl/#remove-shard) - remove a shard from a node (this deletes data)
|
||||
* [Truncate shards](/enterprise_influxdb/v1.9/tools/influxd-ctl/#truncate-shards) - truncate all active shard groups and start new shards immediately (This is useful when adding nodes or changing replication factors.)
|
||||
|
||||
This functionality is exposed via an API on the meta service and through [`influxd-ctl` sub-commands](/enterprise_influxdb/v1.9/tools/influxd-ctl/).
|
||||
|
||||
## OSS conversion
|
||||
|
||||
Importing an OSS single-server instance as the first data node is supported.
|
||||
|
||||
See [OSS to cluster migration](/enterprise_influxdb/v1.9/guides/migration/) for
|
||||
step-by-step instructions.
|
||||
|
||||
## Query routing
|
||||
|
||||
The query engine skips failed nodes that hold a shard needed for queries.
|
||||
If there is a replica on another node, it will retry on that node.
|
||||
|
||||
## Backup and restore
|
||||
|
||||
InfluxDB Enterprise clusters support backup and restore functionality starting with
|
||||
version 0.7.1.
|
||||
See [Backup and restore](/enterprise_influxdb/v1.9/administration/backup-and-restore/) for
|
||||
more information.
|
|
@ -0,0 +1,197 @@
|
|||
---
|
||||
title: InfluxDB Enterprise users
|
||||
description: Overview of users in InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/features/users/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 0
|
||||
parent: Enterprise features
|
||||
---
|
||||
|
||||
InfluxDB Enterprise users have functions that are either specific to the web
|
||||
console or specific to the cluster:
|
||||
|
||||
```
|
||||
Users Cluster Permissions
|
||||
|
||||
Penelope
|
||||
O
|
||||
\|/
|
||||
| ----------------------> Dev Account --------> Manage Queries
|
||||
/ \ --------> Monitor
|
||||
--------> Add/Remove Nodes
|
||||
Jim
|
||||
O
|
||||
\|/
|
||||
| ----------------------> Marketing Account ---> View Admin
|
||||
/ \ ---> Graph Role ---> Read
|
||||
---> View Chronograf
|
||||
```
|
||||
|
||||
## Cluster user information
|
||||
|
||||
In the cluster, individual users are assigned to an account.
|
||||
Cluster accounts have permissions and roles.
|
||||
|
||||
In the diagram above, Penelope is assigned to the Dev Account and
|
||||
Jim is assigned to the Marketing Account.
|
||||
The Dev Account includes the permissions to manage queries, monitor the
|
||||
cluster, and add/remove nodes from the cluster.
|
||||
The Marketing Account includes the permission to view and edit the admin screens
|
||||
as well as the Graph Role which contains the permissions to read data and
|
||||
view Chronograf.
|
||||
|
||||
### Roles
|
||||
|
||||
Roles are groups of permissions.
|
||||
A single role can belong to several cluster accounts.
|
||||
|
||||
InfluxDB Enterprise clusters have two built-in roles:
|
||||
|
||||
#### Global Admin
|
||||
|
||||
The Global Admin role has all 16 [cluster permissions](#permissions).
|
||||
|
||||
#### Admin
|
||||
|
||||
The Admin role has all [cluster permissions](#permissions) except for the
|
||||
permissions to:
|
||||
|
||||
* Add/Remove Nodes
|
||||
* Copy Shard
|
||||
* Manage Shards
|
||||
* Rebalance
|
||||
|
||||
### Permissions
|
||||
|
||||
InfluxDB Enterprise clusters have 16 permissions:
|
||||
|
||||
#### View Admin
|
||||
|
||||
Permission to view or edit admin screens.
|
||||
|
||||
#### View Chronograf
|
||||
|
||||
Permission to use Chronograf tools.
|
||||
|
||||
#### Create Databases
|
||||
|
||||
Permission to create databases.
|
||||
|
||||
#### Create Users & Roles
|
||||
|
||||
Permission to create users and roles.
|
||||
|
||||
#### Add/Remove Nodes
|
||||
|
||||
Permission to add/remove nodes from a cluster.
|
||||
|
||||
#### Drop Databases
|
||||
|
||||
Permission to drop databases.
|
||||
|
||||
#### Drop Data
|
||||
|
||||
Permission to drop measurements and series.
|
||||
|
||||
#### Read
|
||||
|
||||
Permission to read data.
|
||||
|
||||
#### Write
|
||||
|
||||
Permission to write data.
|
||||
|
||||
#### Rebalance
|
||||
|
||||
Permission to rebalance a cluster.
|
||||
|
||||
#### Manage Shards
|
||||
|
||||
Permission to copy and delete shards.
|
||||
|
||||
#### Manage Continuous Queries
|
||||
|
||||
Permission to create, show, and drop continuous queries.
|
||||
|
||||
#### Manage Queries
|
||||
|
||||
Permission to show and kill queries.
|
||||
|
||||
#### Manage Subscriptions
|
||||
|
||||
Permission to show, add, and drop subscriptions.
|
||||
|
||||
#### Monitor
|
||||
|
||||
Permission to show stats and diagnostics.
|
||||
|
||||
#### Copy Shard
|
||||
|
||||
Permission to copy shards.
|
||||
|
||||
### Permission to Statement
|
||||
|
||||
The following table lists each cluster permission and the database statements it allows you to execute.
|
||||
|
||||
|Permission|Statement|
|
||||
|---|---|
|
||||
|CreateDatabasePermission|AlterRetentionPolicyStatement, CreateDatabaseStatement, CreateRetentionPolicyStatement, ShowRetentionPoliciesStatement|
|
||||
|ManageContinuousQueryPermission|CreateContinuousQueryStatement, DropContinuousQueryStatement, ShowContinuousQueriesStatement|
|
||||
|ManageSubscriptionPermission|CreateSubscriptionStatement, DropSubscriptionStatement, ShowSubscriptionsStatement|
|
||||
|CreateUserAndRolePermission|CreateUserStatement, DropUserStatement, GrantAdminStatement, GrantStatement, RevokeAdminStatement, RevokeStatement, SetPasswordUserStatement, ShowGrantsForUserStatement, ShowUsersStatement|
|
||||
|DropDataPermission|DeleteSeriesStatement, DeleteStatement, DropMeasurementStatement, DropSeriesStatement|
|
||||
|DropDatabasePermission|DropDatabaseStatement, DropRetentionPolicyStatement|
|
||||
|ManageShardPermission|DropShardStatement, ShowShardGroupsStatement, ShowShardsStatement|
|
||||
|ManageQueryPermission|KillQueryStatement, ShowQueriesStatement|
|
||||
|MonitorPermission|ShowDiagnosticsStatement, ShowStatsStatement|
|
||||
|ReadDataPermission|ShowFieldKeysStatement, ShowMeasurementsStatement, ShowSeriesStatement, ShowTagKeysStatement, ShowTagValuesStatement, ShowRetentionPoliciesStatement|
|
||||
|NoPermissions|ShowDatabasesStatement|
|
||||
|Determined by type of select statement|SelectStatement|
|
||||
|
||||
### Statement to Permission
|
||||
|
||||
The following table lists database statements, the permissions required to execute them, and whether the permission applies to a single database (Database) or to the entire cluster (Cluster).
|
||||
|
||||
|Statement|Permission|Scope|
|
||||
|---|---|---|
|
||||
|AlterRetentionPolicyStatement|CreateDatabasePermission|Database|
|
||||
|CreateContinuousQueryStatement|ManageContinuousQueryPermission|Database|
|
||||
|CreateDatabaseStatement|CreateDatabasePermission|Cluster|
|
||||
|CreateRetentionPolicyStatement|CreateDatabasePermission|Database|
|
||||
|CreateSubscriptionStatement|ManageSubscriptionPermission|Database|
|
||||
|CreateUserStatement|CreateUserAndRolePermission|Database|
|
||||
|DeleteSeriesStatement|DropDataPermission|Database|
|
||||
|DeleteStatement|DropDataPermission|Database|
|
||||
|DropContinuousQueryStatement|ManageContinuousQueryPermission|Database|
|
||||
|DropDatabaseStatement|DropDatabasePermission|Cluster|
|
||||
|DropMeasurementStatement|DropDataPermission|Database|
|
||||
|DropRetentionPolicyStatement|DropDatabasePermission|Database|
|
||||
|DropSeriesStatement|DropDataPermission|Database|
|
||||
|DropShardStatement|ManageShardPermission|Cluster|
|
||||
|DropSubscriptionStatement|ManageSubscriptionPermission|Database|
|
||||
|DropUserStatement|CreateUserAndRolePermission|Database|
|
||||
|GrantAdminStatement|CreateUserAndRolePermission|Database|
|
||||
|GrantStatement|CreateUserAndRolePermission|Database|
|
||||
|KillQueryStatement|ManageQueryPermission|Database|
|
||||
|RevokeAdminStatement|CreateUserAndRolePermission|Database|
|
||||
|RevokeStatement|CreateUserAndRolePermission|Database|
|
||||
|SelectStatement|Determined by type of select statement|n/a|
|
||||
|SetPasswordUserStatement|CreateUserAndRolePermission|Database|
|
||||
|ShowContinuousQueriesStatement|ManageContinuousQueryPermission|Database|
|
||||
|ShowDatabasesStatement|NoPermissions|Cluster (the user's grants determine which databases are returned in the results)|
|
||||
|ShowDiagnosticsStatement|MonitorPermission|Database|
|
||||
|ShowFieldKeysStatement|ReadDataPermission|Database|
|
||||
|ShowGrantsForUserStatement|CreateUserAndRolePermission|Database|
|
||||
|ShowMeasurementsStatement|ReadDataPermission|Database|
|
||||
|ShowQueriesStatement|ManageQueryPermission|Database|
|
||||
|ShowRetentionPoliciesStatement|CreateDatabasePermission|Database|
|
||||
|ShowSeriesStatement|ReadDataPermission|Database|
|
||||
|ShowShardGroupsStatement|ManageShardPermission|Cluster|
|
||||
|ShowShardsStatement|ManageShardPermission|Cluster|
|
||||
|ShowStatsStatement|MonitorPermission|Database|
|
||||
|ShowSubscriptionsStatement|ManageSubscriptionPermission|Database|
|
||||
|ShowTagKeysStatement|ReadDataPermission|Database|
|
||||
|ShowTagValuesStatement|ReadDataPermission|Database|
|
||||
|ShowUsersStatement|CreateUserAndRolePermission|Database|
|
|
@ -0,0 +1,38 @@
|
|||
---
|
||||
title: Flux data scripting language
|
||||
description: >
|
||||
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Flux
|
||||
weight: 71
|
||||
v2: /influxdb/v2.0/query-data/get-started/
|
||||
---
|
||||
|
||||
Flux is a functional data scripting language designed for querying, analyzing, and acting on time series data.
|
||||
It takes the power of [InfluxQL](/enterprise_influxdb/v1.9/query_language/spec/) and the functionality of [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/) and combines them into a single, unified syntax.
|
||||
|
||||
> Flux v0.65 is production-ready and included with [InfluxDB v1.8](/enterprise_influxdb/v1.9).
|
||||
> The InfluxDB v1.8 implementation of Flux is read-only and does not support
|
||||
> writing data back to InfluxDB.
|
||||
|
||||
## Flux design principles
|
||||
Flux is designed to be usable, readable, flexible, composable, testable, contributable, and shareable.
|
||||
Its syntax is largely inspired by [2018's most popular scripting language](https://insights.stackoverflow.com/survey/2018#technology),
|
||||
JavaScript, and takes a functional approach to data exploration and processing.
|
||||
|
||||
The following example illustrates pulling data from a bucket (similar to an InfluxQL database) for the last hour,
|
||||
filtering that data by the `cpu` measurement and the `cpu=cpu-total` tag, windowing the data in 1 minute intervals,
|
||||
and calculating the average of each window:
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-1h)
|
||||
|> filter(fn:(r) =>
|
||||
r._measurement == "cpu" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> aggregateWindow(every: 1m, fn: mean)
|
||||
```
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,419 @@
|
|||
---
|
||||
title: Flux vs InfluxQL
|
||||
description:
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Flux vs InfluxQL
|
||||
parent: Flux
|
||||
weight: 5
|
||||
---
|
||||
|
||||
Flux is an alternative to [InfluxQL](/enterprise_influxdb/v1.9/query_language/) and other SQL-like query languages for querying and analyzing data.
|
||||
Flux uses functional language patterns making it incredibly powerful, flexible, and able to overcome many of the limitations of InfluxQL.
|
||||
This article outlines many of the tasks possible with Flux but not InfluxQL and provides information about Flux and InfluxQL parity.
|
||||
|
||||
- [Possible with Flux](#possible-with-flux)
|
||||
- [InfluxQL and Flux parity](#influxql-and-flux-parity)
|
||||
|
||||
## Possible with Flux
|
||||
|
||||
- [Joins](#joins)
|
||||
- [Math across measurements](#math-across-measurements)
|
||||
- [Sort by tags](#sort-by-tags)
|
||||
- [Group by any column](#group-by-any-column)
|
||||
- [Window by calendar months and years](#window-by-calendar-months-and-years)
|
||||
- [Work with multiple data sources](#work-with-multiple-data-sources)
|
||||
- [DatePart-like queries](#datepart-like-queries)
|
||||
- [Pivot](#pivot)
|
||||
- [Histograms](#histograms)
|
||||
- [Covariance](#covariance)
|
||||
- [Cast booleans to integers](#cast-booleans-to-integers)
|
||||
- [String manipulation and data shaping](#string-manipulation-and-data-shaping)
|
||||
- [Work with geo-temporal data](#work-with-geo-temporal-data)
|
||||
|
||||
### Joins
|
||||
InfluxQL has never supported joins. They can be accomplished using [TICKscript](/{{< latest "kapacitor" >}}/tick/introduction/),
|
||||
but even TICKscript's join capabilities are limited.
|
||||
Flux's [`join()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/join/) allows you
|
||||
to join data **from any bucket, any measurement, and on any columns** as long as
|
||||
each data set includes the columns on which they are to be joined.
|
||||
This opens the door for really powerful and useful operations.
|
||||
|
||||
```js
|
||||
dataStream1 = from(bucket: "bucket1")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "network" and
|
||||
r._field == "bytes-transferred"
|
||||
)
|
||||
|
||||
dataStream2 = from(bucket: "bucket1")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "httpd" and
|
||||
r._field == "requests-per-sec"
|
||||
)
|
||||
|
||||
join(
|
||||
tables: {d1:dataStream1, d2:dataStream2},
|
||||
on: ["_time", "_stop", "_start", "host"]
|
||||
)
|
||||
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
_For an in-depth walkthrough of using the `join()` function, see [How to join data with Flux](/enterprise_influxdb/v1.9/flux/guides/join)._
|
||||
|
||||
---
|
||||
|
||||
### Math across measurements
|
||||
Being able to perform cross-measurement joins also allows you to run calculations using
|
||||
data from separate measurements – a highly requested feature from the InfluxData community.
|
||||
The example below takes two data streams from separate measurements, `mem` and `processes`,
|
||||
joins them, then calculates the average amount of memory used per running process:
|
||||
|
||||
```js
|
||||
// Memory used (in bytes)
|
||||
memUsed = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field == "used"
|
||||
)
|
||||
|
||||
// Total processes running
|
||||
procTotal = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "processes" and
|
||||
r._field == "total"
|
||||
)
|
||||
|
||||
// Join memory used with total processes and calculate
|
||||
// the average memory (in MB) used for running processes.
|
||||
join(
|
||||
tables: {mem:memUsed, proc:procTotal},
|
||||
on: ["_time", "_stop", "_start", "host"]
|
||||
)
|
||||
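  // The joined columns `_value_mem` and `_value_proc` take their suffixes
  // from the `mem` and `proc` table keys passed to join() above.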
|> map(fn: (r) => ({
|
||||
_time: r._time,
|
||||
_value: (r._value_mem / r._value_proc) / 1000000
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
### Sort by tags
|
||||
InfluxQL's sorting capabilities are very limited, allowing you only to control the
|
||||
sort order of `time` using the `ORDER BY time` clause.
|
||||
Flux's [`sort()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/sort) sorts records based on a list of columns.
|
||||
Depending on the column type, records are sorted lexicographically, numerically, or chronologically.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-12h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "system" and
|
||||
r._field == "uptime"
|
||||
)
|
||||
|> sort(columns:["region", "host", "_value"])
|
||||
```
|
||||
|
||||
### Group by any column
|
||||
InfluxQL lets you group by tags or by time intervals, but nothing else.
|
||||
Flux lets you group by any column in the dataset, including `_value`.
|
||||
Use the Flux [`group()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/group/)
|
||||
to define which columns to group data by.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-12h)
|
||||
|> filter(fn: (r) => r._measurement == "system" and r._field == "uptime" )
|
||||
|> group(columns:["host", "_value"])
|
||||
```
|
||||
|
||||
### Window by calendar months and years
|
||||
InfluxQL does not support windowing data by calendar months and years due to their varied lengths.
|
||||
Flux supports calendar month and year duration units (`1mo`, `1y`) and lets you
|
||||
window and aggregate data by calendar month and year.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start:-1y)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
|> aggregateWindow(every: 1mo, fn: mean)
|
||||
```
|
||||
|
||||
### Work with multiple data sources
|
||||
InfluxQL can only query data stored in InfluxDB.
|
||||
Flux can query data from other data sources such as CSV, PostgreSQL, MySQL, Google BigTable, and more.
|
||||
Join that data with data in InfluxDB to enrich query results.
|
||||
|
||||
- [Flux CSV package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/csv/)
|
||||
- [Flux SQL package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/sql/)
|
||||
- [Flux BigTable package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/bigtable/)
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
import "csv"
|
||||
import "sql"
|
||||
|
||||
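// rawCSV is assumed to be a predefined variable containing annotated CSV data as a string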
csvData = csv.from(csv: rawCSV)
|
||||
sqlData = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query:"SELECT * FROM example_table"
|
||||
)
|
||||
data = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -24h)
|
||||
|> filter(fn: (r) => r._measurement == "sensor")
|
||||
|
||||
auxData = join(tables: {csv: csvData, sql: sqlData}, on: ["sensor_id"])
|
||||
enrichedData = join(tables: {data: data, aux: auxData}, on: ["sensor_id"])
|
||||
|
||||
enrichedData
|
||||
|> yield(name: "enriched_data")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_For an in-depth walkthrough of querying SQL data, see [Query SQL data sources](/enterprise_influxdb/v1.9/flux/guides/sql)._
|
||||
|
||||
---
|
||||
|
||||
### DatePart-like queries
|
||||
InfluxQL doesn't support DatePart-like queries that only return results during specified hours of the day.
|
||||
The Flux [`hourSelection` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/hourselection/)
|
||||
returns only data with time values in a specified hour range.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> hourSelection(start: 9, stop: 17)
|
||||
```
|
||||
|
||||
### Pivot
|
||||
Pivoting data tables has never been supported in InfluxQL.
|
||||
The Flux [`pivot()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/pivot) provides the ability
|
||||
to pivot data tables by specifying `rowKey`, `columnKey`, and `valueColumn` parameters.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> pivot(
|
||||
rowKey:["_time"],
|
||||
columnKey: ["_field"],
|
||||
valueColumn: "_value"
|
||||
)
|
||||
```
|
||||
|
||||
### Histograms
|
||||
The ability to generate histograms has been a highly requested feature for InfluxQL, but has never been supported.
|
||||
Flux's [`histogram()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/histogram) uses input
|
||||
data to generate a cumulative histogram with support for other histogram types coming in the future.
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field == "used_percent"
|
||||
)
|
||||
|> histogram(
|
||||
buckets: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
_For an example of using Flux to create a cumulative histogram, see [Create histograms](/enterprise_influxdb/v1.9/flux/guides/histograms)._
|
||||
|
||||
---
|
||||
|
||||
### Covariance
|
||||
Flux provides functions for simple covariance calculation.
|
||||
The [`covariance()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/covariance)
|
||||
calculates the covariance between two columns and the [`cov()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/cov)
|
||||
calculates the covariance between two data streams.
|
||||
|
||||
###### Covariance between two columns
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start:-5m)
|
||||
|> covariance(columns: ["x", "y"])
|
||||
```
|
||||
|
||||
###### Covariance between two streams of data
|
||||
```js
|
||||
table1 = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "measurement_1"
|
||||
)
|
||||
|
||||
table2 = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "measurement_2"
|
||||
)
|
||||
|
||||
cov(x: table1, y: table2, on: ["_time", "_field"])
|
||||
```
|
||||
|
||||
### Cast booleans to integers
|
||||
InfluxQL supports type casting, but only for numeric data types (floats to integers and vice versa).
|
||||
[Flux type conversion functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/)
|
||||
provide much broader support for type conversions and let you perform some long-requested
|
||||
operations like casting boolean values to integers.
|
||||
|
||||
##### Cast boolean field values to integers
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "m" and
|
||||
r._field == "bool_field"
|
||||
)
|
||||
|> toInt()
|
||||
```
|
||||
|
||||
### String manipulation and data shaping
|
||||
InfluxQL doesn't support string manipulation when querying data.
|
||||
The [Flux Strings package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/strings/) is a collection of functions that operate on string data.
|
||||
When combined with the [`map()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/),
|
||||
functions in the string package allow for operations like string sanitization and normalization.
|
||||
|
||||
```js
|
||||
import "strings"
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "weather" and
|
||||
r._field == "temp"
|
||||
)
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
location: strings.toTitle(v: r.location),
|
||||
sensor: strings.replaceAll(v: r.sensor, t: " ", u: "-"),
|
||||
status: strings.substring(v: r.status, start: 0, end: 8)
|
||||
}))
|
||||
```
|
||||
|
||||
### Work with geo-temporal data
|
||||
InfluxQL doesn't provide functionality for working with geo-temporal data.
|
||||
The [Flux Geo package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/) is a collection of functions that
|
||||
let you shape, filter, and group geo-temporal data.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "geo/autogen")
|
||||
|> range(start: -1w)
|
||||
|> filter(fn: (r) => r._measurement == "taxi")
|
||||
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 20)
|
||||
|> geo.filterRows(
|
||||
region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0},
|
||||
strict: true
|
||||
)
|
||||
|> geo.asTracks(groupBy: ["fare-id"])
|
||||
```
|
||||
|
||||
|
||||
## InfluxQL and Flux parity
|
||||
Flux is working towards complete parity with InfluxQL and new functions are being added to that end.
|
||||
The table below shows InfluxQL statements, clauses, and functions along with their equivalent Flux functions.
|
||||
|
||||
_For a complete list of Flux functions, [view all Flux functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/all-functions)._
|
||||
|
||||
### InfluxQL and Flux parity
|
||||
|
||||
| InfluxQL | Flux Functions |
|
||||
| -------- | -------------- |
|
||||
| [SELECT](/enterprise_influxdb/v1.9/query_language/explore-data/#the-basic-select-statement) | [filter()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/) |
|
||||
| [WHERE](/enterprise_influxdb/v1.9/query_language/explore-data/#the-where-clause) | [filter()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/), [range()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/range/) |
|
||||
| [GROUP BY](/enterprise_influxdb/v1.9/query_language/explore-data/#the-group-by-clause) | [group()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/group/) |
|
||||
| [INTO](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause) | [to()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/outputs/to/) <span><a style="color:orange" href="#footnote">*</a></span> |
|
||||
| [ORDER BY](/enterprise_influxdb/v1.9/query_language/explore-data/#order-by-time-desc) | [sort()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/sort/) |
|
||||
| [LIMIT](/enterprise_influxdb/v1.9/query_language/explore-data/#the-limit-clause) | [limit()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/limit/) |
|
||||
| [SLIMIT](/enterprise_influxdb/v1.9/query_language/explore-data/#the-slimit-clause) | -- |
|
||||
| [OFFSET](/enterprise_influxdb/v1.9/query_language/explore-data/#the-offset-clause) | -- |
|
||||
| [SOFFSET](/enterprise_influxdb/v1.9/query_language/explore-data/#the-soffset-clause) | -- |
|
||||
| [SHOW DATABASES](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-databases) | [buckets()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/buckets/) |
|
||||
| [SHOW MEASUREMENTS](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-measurements) | [v1.measurements](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/influxdb-v1/measurements) |
|
||||
| [SHOW FIELD KEYS](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-field-keys) | [keys()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/keys/) |
|
||||
| [SHOW RETENTION POLICIES](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-retention-policies) | [buckets()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/buckets/) |
|
||||
| [SHOW TAG KEYS](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-tag-keys) | [v1.tagKeys()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/influxdb-v1/tagkeys), [v1.measurementTagKeys()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/influxdb-v1/measurementtagkeys) |
|
||||
| [SHOW TAG VALUES](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-tag-values) | [v1.tagValues()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/influxdb-v1/tagvalues), [v1.measurementTagValues()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/influxdb-v1/measurementtagvalues) |
|
||||
| [SHOW SERIES](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-series) | -- |
|
||||
| [CREATE DATABASE](/enterprise_influxdb/v1.9/query_language/manage-database/#create-database) | -- |
|
||||
| [DROP DATABASE](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-a-database-with-drop-database) | -- |
|
||||
| [DROP SERIES](/enterprise_influxdb/v1.9/query_language/manage-database/#drop-series-from-the-index-with-drop-series) | -- |
|
||||
| [DELETE](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-series-with-delete) | -- |
|
||||
| [DROP MEASUREMENT](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-measurements-with-drop-measurement) | -- |
|
||||
| [DROP SHARD](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-a-shard-with-drop-shard) | -- |
|
||||
| [CREATE RETENTION POLICY](/enterprise_influxdb/v1.9/query_language/manage-database/#create-retention-policies-with-create-retention-policy) | -- |
|
||||
| [ALTER RETENTION POLICY](/enterprise_influxdb/v1.9/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy) | -- |
|
||||
| [DROP RETENTION POLICY](/enterprise_influxdb/v1.9/query_language/manage-database/#delete-retention-policies-with-drop-retention-policy) | -- |
|
||||
| [COUNT](/enterprise_influxdb/v1.9/query_language/functions#count) | [count()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/count/) |
|
||||
| [DISTINCT](/enterprise_influxdb/v1.9/query_language/functions#distinct) | [distinct()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/distinct/) |
|
||||
| [INTEGRAL](/enterprise_influxdb/v1.9/query_language/functions#integral) | [integral()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/integral/) |
|
||||
| [MEAN](/enterprise_influxdb/v1.9/query_language/functions#mean) | [mean()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/mean/) |
|
||||
| [MEDIAN](/enterprise_influxdb/v1.9/query_language/functions#median) | [median()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/median/) |
|
||||
| [MODE](/enterprise_influxdb/v1.9/query_language/functions#mode) | [mode()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/mode/) |
|
||||
| [SPREAD](/enterprise_influxdb/v1.9/query_language/functions#spread) | [spread()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/spread/) |
|
||||
| [STDDEV](/enterprise_influxdb/v1.9/query_language/functions#stddev) | [stddev()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/stddev/) |
|
||||
| [SUM](/enterprise_influxdb/v1.9/query_language/functions#sum) | [sum()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/sum/) |
|
||||
| [BOTTOM](/enterprise_influxdb/v1.9/query_language/functions#bottom) | [bottom()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/bottom/) |
|
||||
| [FIRST](/enterprise_influxdb/v1.9/query_language/functions#first) | [first()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/first/) |
|
||||
| [LAST](/enterprise_influxdb/v1.9/query_language/functions#last) | [last()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/last/) |
|
||||
| [MAX](/enterprise_influxdb/v1.9/query_language/functions#max) | [max()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/max/) |
|
||||
| [MIN](/enterprise_influxdb/v1.9/query_language/functions#min) | [min()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/min/) |
|
||||
| [PERCENTILE](/enterprise_influxdb/v1.9/query_language/functions#percentile) | [quantile()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/quantile/) |
|
||||
| [SAMPLE](/enterprise_influxdb/v1.9/query_language/functions#sample) | [sample()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/sample/) |
|
||||
| [TOP](/enterprise_influxdb/v1.9/query_language/functions#top) | [top()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/top/) |
|
||||
| [ABS](/enterprise_influxdb/v1.9/query_language/functions#abs) | [math.abs()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/abs/) |
|
||||
| [ACOS](/enterprise_influxdb/v1.9/query_language/functions#acos) | [math.acos()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/acos/) |
|
||||
| [ASIN](/enterprise_influxdb/v1.9/query_language/functions#asin) | [math.asin()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/asin/) |
|
||||
| [ATAN](/enterprise_influxdb/v1.9/query_language/functions#atan) | [math.atan()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/atan/) |
|
||||
| [ATAN2](/enterprise_influxdb/v1.9/query_language/functions#atan2) | [math.atan2()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/atan2/) |
|
||||
| [CEIL](/enterprise_influxdb/v1.9/query_language/functions#ceil) | [math.ceil()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/ceil/) |
|
||||
| [COS](/enterprise_influxdb/v1.9/query_language/functions#cos) | [math.cos()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/cos/) |
|
||||
| [CUMULATIVE_SUM](/enterprise_influxdb/v1.9/query_language/functions#cumulative-sum) | [cumulativeSum()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/cumulativesum/) |
|
||||
| [DERIVATIVE](/enterprise_influxdb/v1.9/query_language/functions#derivative) | [derivative()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/derivative/) |
|
||||
| [DIFFERENCE](/enterprise_influxdb/v1.9/query_language/functions#difference) | [difference()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/difference/) |
|
||||
| [ELAPSED](/enterprise_influxdb/v1.9/query_language/functions#elapsed) | [elapsed()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/elapsed/) |
|
||||
| [EXP](/enterprise_influxdb/v1.9/query_language/functions#exp) | [math.exp()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/exp/) |
|
||||
| [FLOOR](/enterprise_influxdb/v1.9/query_language/functions#floor) | [math.floor()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/floor/) |
|
||||
| [HISTOGRAM](/enterprise_influxdb/v1.9/query_language/functions#histogram) | [histogram()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/histogram/) |
|
||||
| [LN](/enterprise_influxdb/v1.9/query_language/functions#ln) | [math.log()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/log/) |
|
||||
| [LOG](/enterprise_influxdb/v1.9/query_language/functions#log) | [math.logb()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/logb/) |
|
||||
| [LOG2](/enterprise_influxdb/v1.9/query_language/functions#log2) | [math.log2()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/log2/) |
|
||||
| [LOG10](/enterprise_influxdb/v1.9/query_language/functions/#log10) | [math.log10()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/log10/) |
|
||||
| [MOVING_AVERAGE](/enterprise_influxdb/v1.9/query_language/functions#moving-average) | [movingAverage()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/movingaverage/) |
|
||||
| [NON_NEGATIVE_DERIVATIVE](/enterprise_influxdb/v1.9/query_language/functions#non-negative-derivative) | [derivative(nonNegative:true)](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/derivative/) |
|
||||
| [NON_NEGATIVE_DIFFERENCE](/enterprise_influxdb/v1.9/query_language/functions#non-negative-difference) | [difference(nonNegative:true)](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/difference/) |
|
||||
| [POW](/enterprise_influxdb/v1.9/query_language/functions#pow) | [math.pow()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/pow/) |
|
||||
| [ROUND](/enterprise_influxdb/v1.9/query_language/functions#round) | [math.round()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/round/) |
|
||||
| [SIN](/enterprise_influxdb/v1.9/query_language/functions#sin) | [math.sin()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/sin/) |
|
||||
| [SQRT](/enterprise_influxdb/v1.9/query_language/functions#sqrt) | [math.sqrt()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/sqrt/) |
|
||||
| [TAN](/enterprise_influxdb/v1.9/query_language/functions#tan) | [math.tan()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/math/tan/) |
|
||||
| [HOLT_WINTERS](/enterprise_influxdb/v1.9/query_language/functions#holt-winters) | [holtWinters()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/holtwinters/) |
|
||||
| [CHANDE_MOMENTUM_OSCILLATOR](/enterprise_influxdb/v1.9/query_language/functions#chande-momentum-oscillator) | [chandeMomentumOscillator()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/chandemomentumoscillator/) |
|
||||
| [EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.9/query_language/functions#exponential-moving-average) | [exponentialMovingAverage()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/exponentialmovingaverage/) |
|
||||
| [DOUBLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.9/query_language/functions#double-exponential-moving-average) | [doubleEMA()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/doubleema/) |
|
||||
| [KAUFMANS_EFFICIENCY_RATIO](/enterprise_influxdb/v1.9/query_language/functions#kaufmans-efficiency-ratio) | [kaufmansER()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/kaufmanser/) |
|
||||
| [KAUFMANS_ADAPTIVE_MOVING_AVERAGE](/enterprise_influxdb/v1.9/query_language/functions#kaufmans-adaptive-moving-average) | [kaufmansAMA()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/kaufmansama/) |
|
||||
| [TRIPLE_EXPONENTIAL_MOVING_AVERAGE](/enterprise_influxdb/v1.9/query_language/functions#triple-exponential-moving-average) | [tripleEMA()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/tripleema/) |
|
||||
| [TRIPLE_EXPONENTIAL_DERIVATIVE](/enterprise_influxdb/v1.9/query_language/functions#triple-exponential-derivative) | [tripleExponentialDerivative()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/tripleexponentialderivative/) |
|
||||
| [RELATIVE_STRENGTH_INDEX](/enterprise_influxdb/v1.9/query_language/functions#relative-strength-index) | [relativeStrengthIndex()](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/relativestrengthindex/) |
|
||||
|
||||
_<span style="font-size:.9rem" id="footnote"><span style="color:orange">*</span> The <code>to()</code> function only writes to InfluxDB 2.0.</span>_
|
|
@ -0,0 +1,115 @@
|
|||
---
|
||||
title: Get started with Flux
|
||||
description: >
|
||||
Get started with Flux, InfluxData's new functional data scripting language.
|
||||
This step-by-step guide will walk you through the basics and get you on your way.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Get started with Flux
|
||||
identifier: get-started
|
||||
parent: Flux
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/getting-started/
|
||||
- /enterprise_influxdb/v1.9/flux/introduction/getting-started/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/
|
||||
v2: /influxdb/v2.0/query-data/get-started/
|
||||
---
|
||||
|
||||
Flux is InfluxData's new functional data scripting language designed for querying,
|
||||
analyzing, and acting on data.
|
||||
|
||||
This multi-part getting started guide walks through important concepts related to Flux.
|
||||
It covers querying time series data from InfluxDB using Flux, and introduces Flux syntax and functions.
|
||||
|
||||
## What you will need
|
||||
|
||||
##### InfluxDB v1.8+
|
||||
Flux v0.65 is built into InfluxDB v1.8 and can be used to query data stored in InfluxDB.
|
||||
|
||||
---
|
||||
|
||||
_For information about downloading and installing InfluxDB, see [InfluxDB installation](/enterprise_influxdb/v1.9/introduction/installation)._
|
||||
|
||||
---
|
||||
|
||||
##### Chronograf v1.8+
|
||||
**Not required but strongly recommended**.
|
||||
Chronograf v1.8's Data Explorer provides a user interface (UI) for writing Flux scripts and visualizing results.
|
||||
Dashboards in Chronograf v1.8+ also support Flux queries.
|
||||
|
||||
---
|
||||
|
||||
_For information about downloading and installing Chronograf, see [Chronograf installation](/{{< latest "chronograf" >}}/introduction/installation)._
|
||||
|
||||
---
|
||||
|
||||
## Key concepts
|
||||
Flux introduces important new concepts you should understand as you get started.
|
||||
|
||||
### Buckets
|
||||
Flux introduces "buckets," a new data storage concept for InfluxDB.
|
||||
A **bucket** is a named location where data is stored that has a retention policy.
|
||||
It's similar to an InfluxDB v1.x "database," but is a combination of both a database and a retention policy.
|
||||
When using multiple retention policies, each retention policy is treated as its own bucket.
|
||||
|
||||
Flux's [`from()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/from), which defines an InfluxDB data source, requires a `bucket` parameter.
|
||||
When using Flux with InfluxDB v1.x, use the following bucket naming convention which combines
|
||||
the database name and retention policy into a single bucket name:
|
||||
|
||||
###### InfluxDB v1.x bucket naming convention
|
||||
```js
|
||||
// Pattern
|
||||
from(bucket:"<database>/<retention-policy>")
|
||||
|
||||
// Example
|
||||
from(bucket:"telegraf/autogen")
|
||||
```
|
||||
|
||||
### Pipe-forward operator
|
||||
Flux uses pipe-forward operators (`|>`) extensively to chain operations together.
|
||||
After each function or operation, Flux returns a table or collection of tables containing data.
|
||||
The pipe-forward operator pipes those tables into the next function or operation where
|
||||
they are further processed or manipulated.
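
For example, a minimal sketch (assuming the `telegraf/autogen` bucket used throughout these guides) in which each pipe-forward operator passes the output tables of one function into the next:

```js
from(bucket: "telegraf/autogen")                 // produces a stream of tables
  |> range(start: -1h)                           // tables piped into range()
  |> filter(fn: (r) => r._measurement == "cpu")  // then piped into filter()
  |> mean()                                      // then piped into mean()
```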
|
||||
|
||||
### Tables
|
||||
Flux structures all data in tables.
|
||||
When data is streamed from data sources, Flux formats it as annotated comma-separated values (CSV), representing tables.
|
||||
Functions then manipulate or process them and output new tables.
|
||||
This makes it easy to chain together functions to build sophisticated queries.
|
||||
|
||||
#### Group keys
|
||||
Every table has a **group key** which describes the contents of the table.
|
||||
It's a list of columns for which every row in the table will have the same value.
|
||||
Columns with unique values in each row are **not** part of the group key.
|
||||
|
||||
As functions process and transform data, each modifies the group keys of output tables.
|
||||
Understanding how tables and group keys are modified by functions is key to properly
|
||||
shaping your data for the desired output.
|
||||
|
||||
###### Example group key
|
||||
```js
|
||||
[_start, _stop, _field, _measurement, host]
|
||||
```
|
||||
|
||||
Note that `_time` and `_value` are excluded from the example group key because they
|
||||
are unique to each row.
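
As a sketch of how a function modifies the group key (again assuming the `telegraf/autogen` bucket), `group()` replaces the group key of its input tables with exactly the columns you list:

```js
from(bucket: "telegraf/autogen")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
  // group() with the default "by" mode sets the output group key to [cpu],
  // regrouping rows from all hosts into one table per cpu value.
  |> group(columns: ["cpu"])
```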
|
||||
|
||||
## Tools for working with Flux
|
||||
|
||||
You have multiple [options for writing and running Flux queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/),
|
||||
but as you're getting started, we recommend using the following:
|
||||
|
||||
### Chronograf's Data Explorer
|
||||
Chronograf's Data Explorer makes it easy to write your first Flux script and visualize the results.
|
||||
To use Chronograf's Flux UI, open the **Data Explorer**. To the right of the source
|
||||
dropdown above the graph placeholder, select **Flux** as the source type.
|
||||
|
||||
This will provide **Schema**, **Script**, and **Functions** panes.
|
||||
The Schema pane allows you to explore your data.
|
||||
The Script pane is where you write your Flux script.
|
||||
The Functions pane provides a list of functions available in your Flux queries.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.9/flux/get-started/query-influxdb/">Query InfluxDB with Flux</a>
|
||||
</div>
|
|
@ -0,0 +1,134 @@
|
|||
---
|
||||
title: Query InfluxDB with Flux
|
||||
description: Learn the basics of using Flux to query data from InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Query InfluxDB
|
||||
parent: get-started
|
||||
weight: 1
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/getting-started/query-influxdb/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/query-influxdb/
|
||||
v2: /influxdb/v2.0/query-data/get-started/query-influxdb/
|
||||
---
|
||||
|
||||
This guide walks through the basics of using Flux to query data from InfluxDB.
|
||||
_**If you haven't already, make sure to install InfluxDB v1.8+, [enable Flux](/enterprise_influxdb/v1.9/flux/installation),
|
||||
and choose a [tool for writing Flux queries](/enterprise_influxdb/v1.9/flux/get-started#tools-for-working-with-flux).**_
|
||||
|
||||
Every Flux query needs the following:
|
||||
|
||||
1. [A data source](#1-define-your-data-source)
|
||||
2. [A time range](#2-specify-a-time-range)
|
||||
3. [Data filters](#3-filter-your-data)
|
||||
|
||||
|
||||
## 1. Define your data source
|
||||
Flux's [`from()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/from) function defines an InfluxDB data source.
|
||||
It requires a [`bucket`](/enterprise_influxdb/v1.9/flux/get-started/#buckets) parameter.
|
||||
For this example, use `telegraf/autogen`, a combination of the default database and retention policy provided by the TICK stack.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
```
|
||||
|
||||
## 2. Specify a time range
|
||||
Flux requires a time range when querying time series data.
|
||||
"Unbounded" queries are very resource-intensive and as a protective measure,
|
||||
Flux will not query the database without a specified range.
|
||||
|
||||
Use the pipe-forward operator (`|>`) to pipe data from your data source into the [`range()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/range)
|
||||
function, which specifies a time range for your query.
|
||||
It accepts two properties: `start` and `stop`.
|
||||
Ranges can be **relative** using negative [durations](/{{< latest "influxdb" "v2" >}}/reference/flux/language/lexical-elements#duration-literals)
|
||||
or **absolute** using [timestamps](/{{< latest "influxdb" "v2" >}}/reference/flux/language/lexical-elements#date-and-time-literals).
|
||||
|
||||
###### Example relative time ranges
|
||||
```js
|
||||
// Relative time range with start only. Stop defaults to now.
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|
||||
// Relative time range with start and stop
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h, stop: -10m)
|
||||
```
|
||||
|
||||
> Relative ranges are relative to "now."
|
||||
|
||||
###### Example absolute time range
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: 2018-11-05T23:30:00Z, stop: 2018-11-06T00:00:00Z)
|
||||
```
|
||||
|
||||
#### Use the following:
|
||||
For this guide, use the relative time range, `-15m`, to limit query results to data from the last 15 minutes:
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
```
|
||||
|
||||
## 3. Filter your data
|
||||
Pass your ranged data into the `filter()` function to narrow results based on data attributes or columns.
|
||||
The `filter()` function has one parameter, `fn`, which expects an anonymous function
|
||||
with logic that filters data based on columns or attributes.
|
||||
|
||||
Flux's anonymous function syntax is very similar to JavaScript's.
|
||||
Records or rows are passed into the `filter()` function as a record (`r`).
|
||||
The anonymous function takes the record and evaluates it to see if it matches the defined filters.
|
||||
Use the `and` logical operator to chain multiple filters.
|
||||
|
||||
```js
|
||||
// Pattern
|
||||
(r) => (r.recordProperty comparisonOperator comparisonExpression)
|
||||
|
||||
// Example with single filter
|
||||
(r) => (r._measurement == "cpu")
|
||||
|
||||
// Example with multiple filters
|
||||
(r) => (r._measurement == "cpu") and (r._field != "usage_system" )
|
||||
```
|
||||
|
||||
#### Use the following:
|
||||
For this example, filter by the `cpu` measurement, the `usage_system` field, and the `cpu-total` tag value:
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
```
|
||||
|
||||
## 4. Yield your queried data
|
||||
Use Flux's `yield()` function to output the filtered tables as the result of the query.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> yield()
|
||||
```
|
||||
|
||||
> Chronograf and the `influx` CLI automatically assume a `yield()` function at
|
||||
> the end of each script in order to output and visualize the data.
|
||||
> Best practice is to include a `yield()` function, but it is not always necessary.
|
||||
|
||||
## Congratulations!
|
||||
You have now queried data from InfluxDB using Flux.
|
||||
|
||||
The query shown here is a barebones example.
|
||||
Flux queries can be extended in many ways to form powerful scripts.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.9/flux/get-started/">Get started with Flux</a>
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.9/flux/get-started/transform-data/">Transform your data</a>
|
||||
</div>
|
|
@ -0,0 +1,215 @@
|
|||
---
|
||||
title: Flux syntax basics
|
||||
description: An introduction to the basic elements of the Flux syntax with real-world application examples.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Syntax basics
|
||||
parent: get-started
|
||||
weight: 3
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/getting-started/syntax-basics/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/
|
||||
v2: /influxdb/v2.0/query-data/get-started/syntax-basics/
|
||||
---
|
||||
|
||||
|
||||
Flux, at its core, is a scripting language designed specifically for working with data.
|
||||
This guide walks through a handful of simple expressions and how they are handled in Flux.
|
||||
|
||||
### Simple expressions
|
||||
Flux is a scripting language that supports basic expressions.
|
||||
For example, simple addition:
|
||||
|
||||
```js
|
||||
> 1 + 1
|
||||
2
|
||||
```
|
||||
|
||||
### Variables
|
||||
Assign an expression to a variable using the assignment operator, `=`.
|
||||
|
||||
```js
|
||||
> s = "this is a string"
|
||||
> i = 1 // an integer
|
||||
> f = 2.0 // a floating point number
|
||||
```
|
||||
|
||||
Type the name of a variable to print its value:
|
||||
|
||||
```js
|
||||
> s
|
||||
this is a string
|
||||
> i
|
||||
1
|
||||
> f
|
||||
2
|
||||
```
|
||||
|
||||
### Records
|
||||
Flux also supports records. Each value in a record can be a different data type.
|
||||
|
||||
```js
|
||||
> o = {name:"Jim", age: 42, "favorite color": "red"}
|
||||
```
|
||||
|
||||
Use **dot notation** to access the properties of a record:
|
||||
|
||||
```js
|
||||
> o.name
|
||||
Jim
|
||||
> o.age
|
||||
42
|
||||
```
|
||||
|
||||
Or **bracket notation**:
|
||||
|
||||
```js
|
||||
> o["name"]
|
||||
Jim
|
||||
> o["age"]
|
||||
42
|
||||
> o["favorite color"]
|
||||
red
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Use bracket notation to reference record properties with special or
|
||||
white space characters in the property key.
|
||||
{{% /note %}}
|
||||
|
||||
### Lists
|
||||
Flux supports lists. List values must be the same type.
|
||||
|
||||
```js
|
||||
> n = 4
|
||||
> l = [1,2,3,n]
|
||||
> l
|
||||
[1, 2, 3, 4]
|
||||
```
|
||||
|
||||
### Functions
|
||||
Flux uses functions for most of its heavy lifting.
|
||||
Below is a simple function that squares a number, `n`.
|
||||
|
||||
```js
|
||||
> square = (n) => n * n
|
||||
> square(n:3)
|
||||
9
|
||||
```
|
||||
|
||||
> Flux does not support positional arguments or parameters.
|
||||
> Parameters must always be named when calling a function.
|
||||
|
||||
### Pipe-forward operator
|
||||
Flux uses the pipe-forward operator (`|>`) extensively to chain operations together.
|
||||
After each function or operation, Flux returns a table or collection of tables containing data.
|
||||
The pipe-forward operator pipes those tables into the next function where they are further processed or manipulated.
|
||||
|
||||
```js
|
||||
data |> someFunction() |> anotherFunction()
|
||||
```
|
||||
|
||||
## Real-world application of basic syntax
|
||||
This likely seems familiar if you've already been through the other [getting started guides](/enterprise_influxdb/v1.9/flux/get-started).
|
||||
Flux's syntax is inspired by JavaScript and other functional scripting languages.
|
||||
As you begin to apply these basic principles in real-world use cases such as creating data stream variables,
|
||||
custom functions, etc., the power of Flux and its ability to query and process data will become apparent.
|
||||
|
||||
The examples below provide both multi-line and single-line versions of each input command.
|
||||
Carriage returns in Flux aren't necessary, but do help with readability.
|
||||
Both single- and multi-line commands can be copied and pasted into the `influx` CLI running in Flux mode.
|
||||
|
||||
{{< tabs-wrapper >}}
|
||||
{{% tabs %}}
|
||||
[Multi-line inputs](#)
|
||||
[Single-line inputs](#)
|
||||
{{% /tabs %}}
|
||||
{{% tab-content %}}
|
||||
### Define data stream variables
|
||||
A common use case for variable assignments in Flux is creating variables for one
|
||||
or more input data streams.
|
||||
|
||||
```js
|
||||
timeRange = -1h
|
||||
|
||||
cpuUsageUser =
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_user" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|
||||
memUsagePercent =
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: timeRange)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field == "used_percent"
|
||||
)
|
||||
```
|
||||
|
||||
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
|
||||
|
||||
### Define custom functions
|
||||
Create a function that returns the top `n` rows from the input stream, ranked by `_value`.
|
||||
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
|
||||
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
|
||||
|
||||
```js
|
||||
topN = (tables=<-, n) =>
|
||||
tables
|
||||
|> sort(desc: true)
|
||||
|> limit(n: n)
|
||||
```
|
||||
|
||||
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
|
||||
|
||||
Using this new custom function `topN` and the `cpuUsageUser` data stream variable defined above,
|
||||
find the top five data points and yield the results.
|
||||
|
||||
```js
|
||||
cpuUsageUser
|
||||
|> topN(n:5)
|
||||
|> yield()
|
||||
```
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
### Define data stream variables
|
||||
A common use case for variable assignments in Flux is creating variables for multiple filtered input data streams.
|
||||
|
||||
```js
|
||||
timeRange = -1h
|
||||
cpuUsageUser = from(bucket:"telegraf/autogen") |> range(start: timeRange) |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user" and r.cpu == "cpu-total")
|
||||
memUsagePercent = from(bucket:"telegraf/autogen") |> range(start: timeRange) |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
|
||||
```
|
||||
|
||||
These variables can be used in other functions, such as `join()`, while keeping the syntax minimal and flexible.
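
For example, a minimal sketch of combining the two streams defined above with `join()` (the `on` columns shown are illustrative and depend on your data):

```js
join(tables: {cpu: cpuUsageUser, mem: memUsagePercent}, on: ["_time"])
```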
|
||||
|
||||
### Define custom functions
|
||||
Let's create a function that returns the `n` rows in the input data stream with the highest `_value`s.
|
||||
To do this, pass the input stream (`tables`) and the number of results to return (`n`) into a custom function.
|
||||
Then use Flux's `sort()` and `limit()` functions to find the top `n` results in the data set.
|
||||
|
||||
```js
|
||||
topN = (tables=<-, n) => tables |> sort(desc: true) |> limit(n: n)
|
||||
```
|
||||
|
||||
_More information about creating custom functions is available in the [Custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) documentation._
|
||||
|
||||
Using the `cpuUsageUser` data stream variable defined [above](#define-data-stream-variables),
|
||||
find the top five data points with the custom `topN` function and yield the results.
|
||||
|
||||
```js
|
||||
cpuUsageUser |> topN(n:5) |> yield()
|
||||
```
|
||||
{{% /tab-content %}}
|
||||
{{< /tabs-wrapper >}}
|
||||
|
||||
This query will return the five data points with the highest user CPU usage over the last hour.
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.9/flux/get-started/transform-data/">Transform your data</a>
|
||||
</div>
|
|
@ -0,0 +1,184 @@
|
|||
---
|
||||
title: Transform data with Flux
|
||||
description: Learn the basics of using Flux to transform data queried from InfluxDB.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Transform your data
|
||||
parent: get-started
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/getting-started/transform-data/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/get-started/transform-data/
|
||||
v2: /influxdb/v2.0/query-data/get-started/transform-data/
|
||||
---
|
||||
|
||||
When [querying data from InfluxDB](/enterprise_influxdb/v1.9/flux/get-started/query-influxdb),
|
||||
you often need to transform that data in some way.
|
||||
Common examples are aggregating data into averages, downsampling data, etc.
|
||||
|
||||
This guide demonstrates using [Flux functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib) to transform your data.
|
||||
It walks through creating a Flux script that partitions data into windows of time,
|
||||
averages the `_value`s in each window, and outputs the averages as a new table.
|
||||
|
||||
It's important to understand how the "shape" of your data changes through each of these operations.
|
||||
|
||||
## Query data
|
||||
Use the query built in the previous [Query data from InfluxDB](/enterprise_influxdb/v1.9/flux/get-started/query-influxdb)
|
||||
guide, but update the range to pull data from the last hour:
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
```
|
||||
|
||||
## Flux functions
|
||||
Flux provides a number of functions that perform specific operations, transformations, and tasks.
|
||||
You can also [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) in your Flux queries.
|
||||
_Functions are covered in detail in the [Flux functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib) documentation._
|
||||
|
||||
A common type of function used when transforming data queried from InfluxDB is an aggregate function.
|
||||
Aggregate functions take a set of `_value`s in a table, aggregate them, and transform
|
||||
them into a new value.
|
||||
|
||||
This example uses the [`mean()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/mean)
|
||||
to average values within time windows.
|
||||
|
||||
> The following example walks through the steps required to window and aggregate data,
|
||||
> but there is an [`aggregateWindow()` helper function](#helper-functions) that does it for you.
|
||||
> It's just good to understand the steps in the process.
|
||||
|
||||
## Window your data
|
||||
Flux's [`window()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/window) partitions records based on a time value.
|
||||
Use the `every` parameter to define a duration of time for each window.
|
||||
|
||||
{{% note %}}
|
||||
#### Calendar months and years
|
||||
`every` supports all [valid duration units](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#duration-types),
|
||||
including **calendar months (`1mo`)** and **years (`1y`)**.
|
||||
{{% /note %}}
|
||||
|
||||
For this example, window data in five-minute intervals (`5m`).
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> window(every: 5m)
|
||||
```
|
||||
|
||||
As data is gathered into windows of time, each window is output as its own table.
|
||||
When visualized, each table is assigned a unique color.
|
||||
|
||||

|
||||
|
||||
## Aggregate windowed data
|
||||
Flux aggregate functions take the `_value`s in each table and aggregate them in some way.
|
||||
Use the [`mean()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/mean) to average the `_value`s of each table.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
```
|
||||
|
||||
As rows in each window are aggregated, their output table contains only a single row with the aggregate value.
|
||||
Windowed tables are all still separate and, when visualized, will appear as single, unconnected points.
|
||||
|
||||

|
||||
|
||||
## Add times to your aggregates
|
||||
As values are aggregated, the resulting tables do not have a `_time` column because
|
||||
the records used for the aggregation all have different timestamps.
|
||||
Aggregate functions don't infer what time should be used for the aggregate value.
|
||||
Therefore the `_time` column is dropped.
|
||||
|
||||
A `_time` column is required in the [next operation](#unwindow-aggregate-tables).
|
||||
To add one, use the [`duplicate()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/duplicate)
|
||||
to duplicate the `_stop` column as the `_time` column for each windowed table.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
```
|
||||
|
||||
## Unwindow aggregate tables
|
||||
|
||||
Use the `window()` function with the `every: inf` parameter to gather all points
|
||||
into a single, infinite window.
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> window(every: 5m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
|> window(every: inf)
|
||||
```
|
||||
|
||||
Once ungrouped and combined into a single table, the aggregate data points will appear connected in your visualization.
|
||||
|
||||

|
||||
|
||||
## Helper functions
|
||||
This may seem like a lot of coding just to build a query that aggregates data. However, going through the
|
||||
process helps you understand how data changes "shape" as it passes through each function.
|
||||
|
||||
Flux provides (and allows you to create) "helper" functions that abstract many of these steps.
|
||||
The same operation performed in this guide can be accomplished using the
|
||||
[`aggregateWindow()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow).
|
||||
|
||||
```js
|
||||
from(bucket:"telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_system" and
|
||||
r.cpu == "cpu-total"
|
||||
)
|
||||
|> aggregateWindow(every: 5m, fn: mean)
|
||||
```
|
||||
|
||||
## Congratulations!
|
||||
You have now constructed a Flux query that uses Flux functions to transform your data.
|
||||
There are many more ways to manipulate your data using both Flux's primitive functions
|
||||
and your own custom functions, but this is a good introduction into the basic syntax and query structure.
|
||||
|
||||
---
|
||||
|
||||
_For a deeper dive into windowing and aggregating data with example data output for each transformation,
|
||||
view the [Windowing and aggregating data](/enterprise_influxdb/v1.9/flux/guides/window-aggregate) guide._
|
||||
|
||||
---
|
||||
|
||||
<div class="page-nav-btns">
|
||||
<a class="btn prev" href="/enterprise_influxdb/v1.9/flux/get-started/query-influxdb/">Query InfluxDB</a>
|
||||
<a class="btn next" href="/enterprise_influxdb/v1.9/flux/get-started/syntax-basics/">Syntax basics</a>
|
||||
</div>
|
|
@ -0,0 +1,40 @@
|
|||
---
|
||||
title: Query data with Flux
|
||||
description: Guides that walk through both common and complex queries and use cases for Flux.
|
||||
weight: 3
|
||||
aliases:
|
||||
- /flux/latest/
|
||||
- /flux/latest/introduction
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Query with Flux
|
||||
parent: Flux
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/
|
||||
v2: /influxdb/v2.0/query-data/flux/
|
||||
---
|
||||
|
||||
The following guides walk through both common and complex queries and use cases for Flux.
|
||||
|
||||
{{% note %}}
|
||||
#### Example data variable
|
||||
Many of the examples provided in the following guides use a `data` variable,
|
||||
which represents a basic query that filters data by measurement and field.
|
||||
`data` is defined as:
|
||||
|
||||
```js
|
||||
data = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field"
|
||||
)
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
## Flux query guides
|
||||
|
||||
{{< children type="anchored-list" pages="all" >}}
|
||||
|
||||
---
|
||||
|
||||
{{< children pages="all" readmore="true" hr="true" >}}
|
|
@ -0,0 +1,210 @@
|
|||
---
|
||||
title: Calculate percentages with Flux
|
||||
list_title: Calculate percentages
|
||||
description: >
|
||||
Use `pivot()` or `join()` and the `map()` function to align operand values into rows and calculate a percentage.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Calculate percentages
|
||||
identifier: flux-calc-perc
|
||||
parent: Query with Flux
|
||||
weight: 6
|
||||
list_query_example: percentages
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/
|
||||
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
|
||||
---
|
||||
|
||||
Calculating percentages from queried data is a common use case for time series data.
|
||||
To calculate a percentage in Flux, the operands must be in the same row.
|
||||
Use `map()` to re-map values in the row and calculate a percentage.
|
||||
|
||||
**To calculate percentages**
|
||||
|
||||
1. Use [`from()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/from/),
|
||||
[`range()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/range/) and
|
||||
[`filter()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/) to query operands.
|
||||
2. Use [`pivot()` or `join()`](/enterprise_influxdb/v1.9/flux/guides/mathematic-operations/#pivot-vs-join)
|
||||
to align operand values into rows.
|
||||
3. Use [`map()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/)
|
||||
to divide the numerator operand value by the denominator operand value and multiply by 100.
|
||||
|
||||
{{% note %}}
|
||||
The following examples use `pivot()` to align operands into rows because
|
||||
`pivot()` works in most cases and is more performant than `join()`.
|
||||
_See [Pivot vs join](/enterprise_influxdb/v1.9/flux/guides/mathematic-operations/#pivot-vs-join)._
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "m1" and r._field =~ /field[1-2]/ )
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with _value: r.field1 / r.field2 * 100.0 }))
|
||||
```
|
||||
|
||||
## GPU monitoring example
|
||||
The following example queries data from the gpu-monitor bucket and calculates the
|
||||
percentage of GPU memory used over time.
|
||||
Data includes the following:
|
||||
|
||||
- **`gpu` measurement**
|
||||
- **`mem_used` field**: used GPU memory in bytes
|
||||
- **`mem_total` field**: total GPU memory in bytes
|
||||
|
||||
### Query mem_used and mem_total fields
|
||||
```js
|
||||
from(bucket: "gpu-monitor")
|
||||
|> range(start: 2020-01-01T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/)
|
||||
```
|
||||
|
||||
###### Returns the following stream of tables:
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_used | 2517924577 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_used | 2695091978 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_used | 2576980377 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_used | 3006477107 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_used | 3543348019 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_used | 4402341478 |
|
||||
|
||||
<p style="margin:-2.5rem 0;"></p>
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_total | 8589934592 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_total | 8589934592 |
|
||||
|
||||
### Pivot fields into columns
|
||||
Use `pivot()` to pivot the `mem_used` and `mem_total` fields into columns.
|
||||
Output includes `mem_used` and `mem_total` columns with values for each corresponding `_time`.
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
```
|
||||
|
||||
###### Returns the following:
|
||||
|
||||
| _time | _measurement | mem_used | mem_total |
|
||||
|:----- |:------------:| --------: | ---------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | 2517924577 | 8589934592 |
|
||||
| 2020-01-01T00:00:10Z | gpu | 2695091978 | 8589934592 |
|
||||
| 2020-01-01T00:00:20Z | gpu | 2576980377 | 8589934592 |
|
||||
| 2020-01-01T00:00:30Z | gpu | 3006477107 | 8589934592 |
|
||||
| 2020-01-01T00:00:40Z | gpu | 3543348019 | 8589934592 |
|
||||
| 2020-01-01T00:00:50Z | gpu | 4402341478 | 8589934592 |
|
||||
|
||||
### Map new values
|
||||
Each row now contains the values necessary to calculate a percentage.
|
||||
Use `map()` to re-map values in each row.
|
||||
Divide `mem_used` by `mem_total` and multiply by 100 to return the percentage.
|
||||
|
||||
{{% note %}}
|
||||
To return a precise float percentage value that includes decimal points, the example
|
||||
below casts integer field values to floats and multiplies by a float value (`100.0`).
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> map(fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "mem_used_percent",
|
||||
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
|
||||
}))
|
||||
```
|
||||
##### Query results:
|
||||
|
||||
| _time | _measurement | _field | _value |
|
||||
|:----- |:------------:|:------: | ------: |
|
||||
| 2020-01-01T00:00:00Z | gpu | mem_used_percent | 29.31 |
|
||||
| 2020-01-01T00:00:10Z | gpu | mem_used_percent | 31.37 |
|
||||
| 2020-01-01T00:00:20Z | gpu | mem_used_percent | 30.00 |
|
||||
| 2020-01-01T00:00:30Z | gpu | mem_used_percent | 35.00 |
|
||||
| 2020-01-01T00:00:40Z | gpu | mem_used_percent | 41.25 |
|
||||
| 2020-01-01T00:00:50Z | gpu | mem_used_percent | 51.25 |
|
||||
|
||||
### Full query
|
||||
```js
|
||||
from(bucket: "gpu-monitor")
|
||||
|> range(start: 2020-01-01T00:00:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "gpu" and r._field =~ /mem_/ )
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "mem_used_percent",
|
||||
_value: float(v: r.mem_used) / float(v: r.mem_total) * 100.0
|
||||
}))
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
#### Calculate percentages using multiple fields
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> filter(fn: (r) =>
|
||||
r._field == "used_system" or
|
||||
r._field == "used_user" or
|
||||
r._field == "total"
|
||||
)
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with
|
||||
_value: float(v: r.used_system + r.used_user) / float(v: r.total) * 100.0
|
||||
}))
|
||||
```
|
||||
|
||||
#### Calculate percentages using multiple measurements
|
||||
|
||||
1. Ensure measurements are in the same [bucket](/enterprise_influxdb/v1.9/flux/get-started/#buckets).
|
||||
2. Use `filter()` to include data from both measurements.
|
||||
3. Use `group()` to ungroup data and return a single table.
|
||||
4. Use `pivot()` to pivot fields into columns.
|
||||
5. Use `map()` to re-map rows and perform the percentage calculation.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
(r._measurement == "m1" or r._measurement == "m2") and
|
||||
(r._field == "field1" or r._field == "field2")
|
||||
)
|
||||
|> group()
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with _value: r.field1 / r.field2 * 100.0 }))
|
||||
```
|
||||
|
||||
#### Calculate percentages using multiple data sources
|
||||
```js
|
||||
import "sql"
|
||||
import "influxdata/influxdb/secrets"
|
||||
|
||||
pgUser = secrets.get(key: "POSTGRES_USER")
|
||||
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
|
||||
pgHost = secrets.get(key: "POSTGRES_HOST")
|
||||
|
||||
t1 = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
|
||||
query:"SELECT id, name, available FROM exampleTable"
|
||||
)
|
||||
|
||||
t2 = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field"
|
||||
)
|
||||
|
||||
join(tables: {t1: t1, t2: t2}, on: ["id"])
|
||||
|> map(fn: (r) => ({ r with _value: r._value_t2 / r.available_t1 * 100.0 }))
|
||||
```
|
|
@ -0,0 +1,197 @@
|
|||
---
|
||||
title: Query using conditional logic
|
||||
seotitle: Query using conditional logic in Flux
|
||||
list_title: Conditional logic
|
||||
description: >
|
||||
This guide describes how to use Flux conditional expressions, such as `if`,
|
||||
`else`, and `then`, to query and transform data. **Flux evaluates statements from left to right and stops evaluating once a condition matches.**
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Conditional logic
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/conditional-logic/
|
||||
v2: /influxdb/v2.0/query-data/flux/conditional-logic/
|
||||
list_code_example: |
|
||||
```js
|
||||
if color == "green" then "008000" else "ffffff"
|
||||
```
|
||||
---
|
||||
|
||||
Flux provides `if`, `then`, and `else` conditional expressions that allow for powerful and flexible Flux queries.
|
||||
|
||||
##### Conditional expression syntax
|
||||
```js
|
||||
// Pattern
|
||||
if <condition> then <action> else <alternative-action>
|
||||
|
||||
// Example
|
||||
if color == "green" then "008000" else "ffffff"
|
||||
```
|
||||
|
||||
Conditional expressions are most useful in the following contexts:
|
||||
|
||||
- When defining variables.
|
||||
- When using functions that operate on a single row at a time (
|
||||
[`filter()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/),
|
||||
[`map()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/),
|
||||
[`reduce()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/reduce) ).
|
||||
|
||||
## Evaluating conditional expressions
|
||||
|
||||
Flux evaluates statements in order and stops evaluating once a condition matches.
|
||||
|
||||
For example, given the following statement:
|
||||
|
||||
```js
|
||||
if r._value > 95.0000001 and r._value <= 100.0 then "critical"
|
||||
else if r._value > 85.0000001 and r._value <= 95.0 then "warning"
|
||||
else if r._value > 70.0000001 and r._value <= 85.0 then "high"
|
||||
else "normal"
|
||||
```
|
||||
|
||||
When `r._value` is 96, the output is "critical" and the remaining conditions are not evaluated.
|
||||
|
||||
## Examples
|
||||
|
||||
- [Conditionally set the value of a variable](#conditionally-set-the-value-of-a-variable)
|
||||
- [Create conditional filters](#create-conditional-filters)
|
||||
- [Conditionally transform column values with map()](#conditionally-transform-column-values-with-map)
|
||||
- [Conditionally increment a count with reduce()](#conditionally-increment-a-count-with-reduce)
|
||||
|
||||
### Conditionally set the value of a variable
|
||||
The following example sets the `overdue` variable based on the
|
||||
`dueDate` variable's relation to `now()`.
|
||||
|
||||
```js
|
||||
dueDate = 2019-05-01
|
||||
overdue = if dueDate < now() then true else false
|
||||
```
|
||||
|
||||
### Create conditional filters
|
||||
The following example uses an example `metric` variable to change how the query filters data.
|
||||
`metric` has three possible values:
|
||||
|
||||
- Memory
|
||||
- CPU
|
||||
- Disk
|
||||
|
||||
```js
|
||||
metric = "Memory"
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
    if metric == "Memory"
|
||||
then r._measurement == "mem" and r._field == "used_percent"
|
||||
    else if metric == "CPU"
|
||||
then r._measurement == "cpu" and r._field == "usage_user"
|
||||
    else if metric == "Disk"
|
||||
then r._measurement == "disk" and r._field == "used_percent"
|
||||
else r._measurement != ""
|
||||
)
|
||||
```
|
||||
|
||||
### Conditionally transform column values with map()
|
||||
The following example uses the [`map()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/)
|
||||
to conditionally transform column values.
|
||||
It sets the `level` column to a specific string based on `_value` column.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Comments](#)
|
||||
[Comments](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
level:
|
||||
if r._value >= 95.0000001 and r._value <= 100.0 then "critical"
|
||||
else if r._value >= 85.0000001 and r._value <= 95.0 then "warning"
|
||||
else if r._value >= 70.0000001 and r._value <= 85.0 then "high"
|
||||
else "normal"
|
||||
})
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
|> map(fn: (r) => ({
|
||||
// Retain all existing columns in the mapped row
|
||||
r with
|
||||
// Set the level column value based on the _value column
|
||||
level:
|
||||
if r._value >= 95.0000001 and r._value <= 100.0 then "critical"
|
||||
else if r._value >= 85.0000001 and r._value <= 95.0 then "warning"
|
||||
else if r._value >= 70.0000001 and r._value <= 85.0 then "high"
|
||||
else "normal"
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
### Conditionally increment a count with reduce()
|
||||
The following example uses the [`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/)
|
||||
and [`reduce()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
|
||||
functions to count the number of records in every five minute window that exceed a defined threshold.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Comments](#)
|
||||
[Comments](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
threshold = 65.0
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
fn: (column, tables=<-) => tables |> reduce(
|
||||
identity: {above_threshold_count: 0.0},
|
||||
fn: (r, accumulator) => ({
|
||||
above_threshold_count:
|
||||
if r._value >= threshold then accumulator.above_threshold_count + 1.0
|
||||
else accumulator.above_threshold_count + 0.0
|
||||
})
|
||||
)
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
threshold = 65.0
|
||||
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent" )
|
||||
// Aggregate data into 5 minute windows using a custom reduce() function
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
// Use a custom function in the fn parameter.
|
||||
// The aggregateWindow fn parameter requires 'column' and 'tables' parameters.
|
||||
fn: (column, tables=<-) => tables |> reduce(
|
||||
identity: {above_threshold_count: 0.0},
|
||||
fn: (r, accumulator) => ({
|
||||
// Conditionally increment above_threshold_count if
|
||||
        // r._value exceeds the threshold
|
||||
above_threshold_count:
|
||||
if r._value >= threshold then accumulator.above_threshold_count + 1.0
|
||||
else accumulator.above_threshold_count + 0.0
|
||||
})
|
||||
)
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
|
@ -0,0 +1,69 @@
|
|||
---
|
||||
title: Query cumulative sum
|
||||
seotitle: Query cumulative sum in Flux
|
||||
list_title: Cumulative sum
|
||||
description: >
|
||||
Use the `cumulativeSum()` function to calculate a running total of values.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Cumulative sum
|
||||
list_query_example: cumulative_sum
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/cumulativesum/
|
||||
v2: /influxdb/v2.0/query-data/flux/cumulativesum/
|
||||
---
|
||||
|
||||
Use the [`cumulativeSum()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/cumulativesum/)
|
||||
to calculate a running total of values.
|
||||
`cumulativeSum` sums the values of subsequent records and returns each row updated with the summed total.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content "half" %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
| ----- |:------:|
|
||||
| 0001 | 1 |
|
||||
| 0002 | 2 |
|
||||
| 0003 | 1 |
|
||||
| 0004 | 3 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content "half" %}}
|
||||
**`cumulativeSum()` returns:**
|
||||
|
||||
| _time | _value |
|
||||
| ----- |:------:|
|
||||
| 0001 | 1 |
|
||||
| 0002 | 3 |
|
||||
| 0003 | 4 |
|
||||
| 0004 | 7 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.9/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
##### Calculate the running total of values
|
||||
```js
|
||||
data
|
||||
|> cumulativeSum()
|
||||
```
|
||||
|
||||
## Use cumulativeSum() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
`aggregateWindow()` expects an aggregate function that returns a single row for each time window.
|
||||
To use `cumulativeSum()` with `aggregateWindow`, use `sum` in `aggregateWindow()`,
|
||||
then calculate the running total of the aggregate values with `cumulativeSum()`.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(every: 5m, fn: sum)
|
||||
|> cumulativeSum()
|
||||
```
|
|
@ -0,0 +1,96 @@
|
|||
---
|
||||
title: Execute Flux queries
|
||||
description: Use the InfluxDB CLI, API, and the Chronograf Data Explorer to execute Flux queries.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Execute Flux queries
|
||||
parent: Query with Flux
|
||||
weight: 1
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/guides/executing-queries/
|
||||
v2: /influxdb/v2.0/query-data/execute-queries/
|
||||
---
|
||||
|
||||
There are multiple ways to execute Flux queries with InfluxDB and Chronograf v1.8+.
|
||||
This guide covers the different options:
|
||||
|
||||
1. [Chronograf's Data Explorer](#chronograf-s-data-explorer)
|
||||
2. [Influx CLI](#influx-cli)
|
||||
3. [InfluxDB API](#influxdb-api)
|
||||
|
||||
> Before attempting these methods, make sure Flux is enabled by setting
|
||||
> `flux-enabled = true` in the `[http]` section of your InfluxDB configuration file.
|
||||
|
||||
## Chronograf's Data Explorer
|
||||
Chronograf v1.8+ supports Flux in its Data Explorer.
|
||||
Flux queries can be built, executed, and visualized from within the Chronograf user interface.
|
||||
|
||||
## Influx CLI

Use the `influx` CLI's Flux mode (`influx -type=flux`) to run Flux queries from the command line.

{{% note %}}
|
||||
If [authentication is enabled](/enterprise_influxdb/v1.9/administration/authentication_and_authorization)
|
||||
on your InfluxDB instance, use the `-username` flag to provide your InfluxDB username and
|
||||
the `-password` flag to provide your password.
|
||||
{{% /note %}}
|
||||
|
||||
### Submit a Flux query via STDIN
|
||||
Flux queries can be piped into the `influx` CLI via STDIN.
|
||||
Query results are output in your terminal.
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
echo '<flux query>' | influx -type=flux
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
echo '<flux query>' | influx -type=flux -username myuser -password PasSw0rd
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
## InfluxDB API
|
||||
Flux can be used to query InfluxDB through InfluxDB's `/api/v2/query` endpoint.
|
||||
Queried data is returned in annotated CSV format.
|
||||
|
||||
In your request, set the following:
|
||||
|
||||
- `Accept` header to `application/csv`
|
||||
- `Content-type` header to `application/vnd.flux`
|
||||
- If [authentication is enabled](/enterprise_influxdb/v1.9/administration/authentication_and_authorization)
|
||||
on your InfluxDB instance, `Authorization` header to `Token <username>:<password>`
|
||||
|
||||
This allows you to POST the Flux query in plain text and receive the annotated CSV response.
|
||||
|
||||
Below is an example `curl` command that queries InfluxDB using Flux:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[No Auth](#)
|
||||
[Auth Enabled](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```bash
|
||||
curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-H 'Authorization: Token <username>:<password>' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
|
@ -0,0 +1,82 @@
|
|||
---
|
||||
title: Check if a value exists
|
||||
seotitle: Use Flux to check if a value exists
|
||||
list_title: Exists
|
||||
description: >
|
||||
Use the Flux `exists` operator to check if a record contains a key or if that
|
||||
key's value is `null`.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Exists
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/exists/
|
||||
v2: /influxdb/v2.0/query-data/flux/exists/
|
||||
list_code_example: |
|
||||
##### Filter null values
|
||||
```js
|
||||
data
|
||||
|> filter(fn: (r) => exists r._value)
|
||||
```
|
||||
---
|
||||
|
||||
Use the Flux `exists` operator to check if a record contains a key or if that
|
||||
key's value is `null`.
|
||||
|
||||
```js
|
||||
p = {firstName: "John", lastName: "Doe", age: 42}
|
||||
|
||||
exists p.firstName
|
||||
// Returns true
|
||||
|
||||
exists p.height
|
||||
// Returns false
|
||||
```
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
Use `exists` with row functions (
|
||||
[`filter()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/),
|
||||
[`map()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/),
|
||||
[`reduce()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/reduce/))
|
||||
to check if a row includes a column or if the value for that column is `null`.
|
||||
|
||||
#### Filter null values
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) => exists r._value)
|
||||
```
|
||||
|
||||
#### Map values based on existence
|
||||
```js
|
||||
from(bucket: "default")
|
||||
|> range(start: -30s)
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
human_readable:
|
||||
if exists r._value then "${r._field} is ${string(v:r._value)}."
|
||||
else "${r._field} has no value."
|
||||
}))
|
||||
```
|
||||
|
||||
#### Ignore null values in a custom aggregate function
|
||||
```js
|
||||
customSumProduct = (tables=<-) =>
|
||||
tables
|
||||
|> reduce(
|
||||
identity: {sum: 0.0, product: 1.0},
|
||||
fn: (r, accumulator) => ({
|
||||
r with
|
||||
sum:
|
||||
if exists r._value then r._value + accumulator.sum
|
||||
else accumulator.sum,
|
||||
product:
|
||||
          if exists r._value then r._value * accumulator.product
|
||||
else accumulator.product
|
||||
})
|
||||
)
|
||||
```
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
title: Fill null values in data
|
||||
seotitle: Fill null values in data
|
||||
list_title: Fill
|
||||
description: >
|
||||
Use the `fill()` function to replace _null_ values.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Fill
|
||||
list_query_example: fill_null
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/fill/
|
||||
v2: /influxdb/v2.0/query-data/flux/fill/
|
||||
---
|
||||
|
||||
Use the [`fill()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/fill/)
|
||||
to replace _null_ values with:
|
||||
|
||||
- [the previous non-null value](#fill-with-the-previous-value)
|
||||
- [a specified value](#fill-with-a-specified-value)
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
data
|
||||
|> fill(usePrevious: true)
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> fill(value: 0.0)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Fill empty windows of time
|
||||
The `fill()` function **does not** fill empty windows of time.
|
||||
It only replaces _null_ values in existing data.
|
||||
Filling empty windows of time requires time interpolation
|
||||
_(see [influxdata/flux#2428](https://github.com/influxdata/flux/issues/2428))_.
|
||||
{{% /note %}}
|
||||
|
||||
## Fill with the previous value
|
||||
To fill _null_ values with the previous **non-null** value, set the `usePrevious` parameter to `true`.
|
||||
|
||||
{{% note %}}
|
||||
Values remain _null_ if there is no previous non-null value in the table.
|
||||
{{% /note %}}
|
||||
|
||||
```js
|
||||
data
|
||||
|> fill(usePrevious: true)
|
||||
```
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | null |
|
||||
| 2020-01-01T00:04:00Z | null |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`fill(usePrevious: true)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | 0.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.8 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
## Fill with a specified value
|
||||
To fill _null_ values with a specified value, use the `value` parameter to specify the fill value.
|
||||
_The fill value must match the [data type](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#basic-types)
|
||||
of the [column](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/fill/#column)._
|
||||
|
||||
```js
|
||||
data
|
||||
|> fill(value: 0.0)
|
||||
```
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | null |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | null |
|
||||
| 2020-01-01T00:04:00Z | null |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`fill(value: 0.0)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 0.0 |
|
||||
| 2020-01-01T00:02:00Z | 0.8 |
|
||||
| 2020-01-01T00:03:00Z | 0.0 |
|
||||
| 2020-01-01T00:04:00Z | 0.0 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,149 @@
|
|||
---
|
||||
title: Query first and last values
|
||||
seotitle: Query first and last values in Flux
|
||||
list_title: First and last
|
||||
description: >
|
||||
Use the `first()` or `last()` functions to return the first or last point in an input table.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: First & last
|
||||
list_query_example: first_last
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/first-last/
|
||||
v2: /influxdb/v2.0/query-data/flux/first-last/
|
||||
---
|
||||
|
||||
Use the [`first()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/first/) or
|
||||
[`last()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/last/) functions
|
||||
to return the first or last record in an input table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> first()
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> last()
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
By default, InfluxDB returns results sorted by time; however, you can use the
|
||||
[`sort()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/sort/)
|
||||
to change how results are sorted.
|
||||
`first()` and `last()` respect the sort order of input data and return records
|
||||
based on the order they are received in.
|
||||
{{% /note %}}
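
For example, a hedged sketch using the `data` variable from the examples on this page: sorting by `_value` before calling `last()` returns the row with the highest value rather than the most recent one.

```js
data
  |> sort(columns: ["_value"])
  |> last()
```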
|
||||
|
||||
### first
|
||||
`first()` returns the first non-null record in an input table.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
```js
|
||||
|> first()
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### last
|
||||
`last()` returns the last non-null record in an input table.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
|
||||
```js
|
||||
|> last()
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
## Use first() or last() with aggregateWindow()
|
||||
Use `first()` and `last()` with [`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/)
|
||||
to select the first or last records in time-based groups.
|
||||
`aggregateWindow()` segments data into windows of time, aggregates data in each window into a single
|
||||
point using aggregate or selector functions, and then removes the time-based segmentation.
|
||||
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 10 |
|
||||
| 2020-01-01T00:00:15Z | 12 |
|
||||
| 2020-01-01T00:00:45Z | 9 |
|
||||
| 2020-01-01T00:01:05Z | 9 |
|
||||
| 2020-01-01T00:01:10Z | 15 |
|
||||
| 2020-01-01T00:02:30Z | 11 |
|
||||
{{% /flex-content %}}
|
||||
|
||||
{{% flex-content %}}
|
||||
**The following function returns:**
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[first](#)
|
||||
[last](#)
|
||||
{{% /code-tabs %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
|> aggregateWindow(
|
||||
every: 1h,
|
||||
fn: first
|
||||
)
|
||||
```
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:59Z | 10 |
|
||||
| 2020-01-01T00:01:59Z | 9 |
|
||||
| 2020-01-01T00:02:59Z | 11 |
|
||||
{{% /code-tab-content %}}
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
|> aggregateWindow(
|
||||
every: 1h,
|
||||
fn: last
|
||||
)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:59Z | 9 |
|
||||
| 2020-01-01T00:01:59Z | 15 |
|
||||
| 2020-01-01T00:02:59Z | 11 |
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,151 @@
|
|||
---
|
||||
title: Use Flux in Chronograf dashboards
|
||||
description: >
|
||||
This guide walks through using Flux queries in Chronograf dashboard cells,
|
||||
what template variables are available, and how to use them.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Use Flux in dashboards
|
||||
parent: Query with Flux
|
||||
weight: 30
|
||||
canonical: /{{< latest "influxdb" "v1" >}}/flux/guides/flux-in-dashboards/
|
||||
---
|
||||
|
||||
[Chronograf](/{{< latest "chronograf" >}}/) is the web user interface for managing the
|
||||
InfluxData platform that lets you create and customize dashboards to visualize your data.
|
||||
Visualized data is retrieved using either an InfluxQL or Flux query.
|
||||
This guide walks through using Flux queries in Chronograf dashboard cells.
|
||||
|
||||
## Using Flux in dashboard cells
|
||||
|
||||
---
|
||||
|
||||
_**Chronograf v1.8+** and **InfluxDB v1.8 with [Flux enabled](/enterprise_influxdb/v1.9/flux/installation)**
|
||||
are required to use Flux in dashboards._
|
||||
|
||||
---
|
||||
|
||||
To use Flux in a dashboard cell, either create a new cell or edit an existing cell
|
||||
by clicking the **pencil** icon in the top right corner of the cell.
|
||||
To the right of the **Source dropdown** above the graph preview, select **Flux** as the source type.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-cell.png" alt="Flux in Chronograf dashboard cells" />}}
|
||||
|
||||
> The Flux source type is only available if your data source has
|
||||
> [Flux enabled](/enterprise_influxdb/v1.9/flux/installation).
|
||||
|
||||
This will provide **Schema**, **Script**, and **Functions** panes.
|
||||
|
||||
### Schema pane
|
||||
The Schema pane allows you to explore your data and add filters for specific
|
||||
measurements, fields, and tags to your Flux script.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-add-filter.png" title="Add a filter from the Schema panel" />}}
|
||||
|
||||
### Script pane
|
||||
The Script pane is where you write your Flux script.
|
||||
In its default state, the **Script** pane includes an optional [Script Wizard](/chronograf/v1.8/guides/querying-data/#explore-data-with-flux)
|
||||
that uses selected options to build a Flux query for you.
|
||||
The generated query includes all the relevant functions and [template variables](#template-variables-in-flux)
|
||||
required to return your desired data.
|
||||
|
||||
### Functions pane
|
||||
The Functions pane provides a list of functions available in your Flux queries.
|
||||
Clicking on a function will add it to the end of the script in the Script pane.
|
||||
Hovering over a function provides documentation for the function as well as links
|
||||
to deep documentation.
|
||||
|
||||
### Dynamic sources
|
||||
Chronograf can be configured with multiple data sources.
|
||||
The **Sources dropdown** allows you to select a specific data source to connect to,
|
||||
but a **Dynamic Source** option is also available.
|
||||
With a dynamic source, the cell will query data from whatever data source to which
|
||||
Chronograf is currently connected.
|
||||
Connections are managed under Chronograf's **Configuration** tab.
|
||||
|
||||
### View raw data
|
||||
As you're building your Flux scripts, each function processes or transforms your
|
||||
data in ways specific to the function.
|
||||
It can be helpful to view the actual data in order to see how it is being shaped.
|
||||
The **View Raw Data** toggle above the data visualization switches between graphed
|
||||
data and raw data shown in table form.
|
||||
|
||||
{{< img-hd src="/img/influxdb/1-7-flux-dashboard-view-raw.png" alt="View raw data" />}}
|
||||
|
||||
_The **View Raw Data** toggle is only available when using Flux._
|
||||
|
||||
## Template variables in Flux
|
||||
Chronograf [template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
|
||||
allow you to alter specific components of cells’ queries using elements provided in the
|
||||
Chronograf user interface.
|
||||
|
||||
In your Flux query, reference template variables just as you would reference defined Flux variables.
|
||||
The following example uses Chronograf's [predefined template variables](#predefined-template-variables),
|
||||
`dashboardTime`, `upperDashboardTime`, and `autoInterval`:
|
||||
|
||||
```js
|
||||
from(bucket: "telegraf/autogen")
|
||||
|> filter(fn: (r) => r._measurement == "cpu")
|
||||
|> range(
|
||||
start: dashboardTime,
|
||||
stop: upperDashboardTime
|
||||
)
|
||||
  |> window(every: autoInterval)
|
||||
```
|
||||
|
||||
### Predefined template variables
|
||||
|
||||
#### dashboardTime
|
||||
The `dashboardTime` template variable represents the lower time bound of ranged data.
|
||||
Its value is controlled by the time dropdown in your dashboard.
|
||||
It should be used to define the `start` parameter of the `range()` function.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(
|
||||
start: dashboardTime
|
||||
)
|
||||
```
|
||||
|
||||
#### upperDashboardTime
|
||||
The `upperDashboardTime` template variable represents the upper time bound of ranged data.
|
||||
Its value is modified by the time dropdown in your dashboard when using an absolute time range.
|
||||
It should be used to define the `stop` parameter of the `range()` function.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(
|
||||
start: dashboardTime,
|
||||
stop: upperDashboardTime
|
||||
)
|
||||
```
|
||||
> As a best practice, always set the `stop` parameter of the `range()` function to `upperDashboardTime` in cell queries.
|
||||
> Without it, `stop` defaults to "now" and the absolute upper range bound selected in the time dropdown is not honored,
|
||||
> potentially causing unnecessary load on InfluxDB.
|
||||
|
||||
#### autoInterval
|
||||
The `autoInterval` template variable represents the refresh interval of the dashboard
|
||||
and is controlled by the refresh interval dropdown.
|
||||
It's typically used to align window intervals created in
|
||||
[windowing and aggregation](/enterprise_influxdb/v1.9/flux/guides/window-aggregate) operations with dashboard refreshes.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> range(
|
||||
start: dashboardTime,
|
||||
stop: upperDashboardTime
|
||||
)
|
||||
|> aggregateWindow(
|
||||
every: autoInterval,
|
||||
fn: mean
|
||||
)
|
||||
```
|
||||
|
||||
### Custom template variables
|
||||
{{% warn %}}
|
||||
Chronograf does not support the use of custom template variables in Flux queries.
|
||||
{{% /warn %}}
|
||||
|
||||
## Using Flux and InfluxQL
|
||||
Within individual dashboard cells, the use of Flux and InfluxQL is mutually exclusive.
|
||||
However, a dashboard may consist of different cells, each using Flux or InfluxQL.
|
|
@ -0,0 +1,91 @@
|
|||
---
|
||||
title: Work with geo-temporal data
|
||||
list_title: Geo-temporal data
|
||||
description: >
|
||||
Use the Flux Geo package to filter geo-temporal data and group by geographic location or track.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Geo-temporal data
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [Flux Geo package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo) to
|
||||
filter geo-temporal data and group by geographic location or track.
|
||||
|
||||
{{% warn %}}
|
||||
The Geo package is experimental and subject to change at any time.
|
||||
By using it, you agree to the [risks of experimental functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/#use-experimental-functions-at-your-own-risk).
|
||||
{{% /warn %}}
|
||||
|
||||
**To work with geo-temporal data:**
|
||||
|
||||
1. Import the `experimental/geo` package.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
```
|
||||
|
||||
2. Load geo-temporal data. _See below for [sample geo-temporal data](#sample-data)._
|
||||
3. Do one or more of the following:
|
||||
|
||||
- [Shape data to work with the Geo package](#shape-data-to-work-with-the-geo-package)
|
||||
- [Filter data by region](#filter-geo-temporal-data-by-region) (using strict or non-strict filters)
|
||||
- [Group data by area or by track](#group-geo-temporal-data)
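
For example, a minimal sketch that combines these steps, using the `sampleGeoData` variable described in [Sample data](#sample-data) below:

```js
import "experimental/geo"

sampleGeoData
    |> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
    |> geo.groupByArea(newColumn: "geoArea", level: 5)
```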
|
||||
|
||||
{{< children >}}
|
||||
|
||||
---
|
||||
|
||||
## Sample data
|
||||
Many of the examples in this section use a `sampleGeoData` variable that represents
|
||||
a sample set of geo-temporal data.
|
||||
The [Bird Migration Sample Data](https://github.com/influxdata/influxdb2-sample-data/tree/master/bird-migration-data)
|
||||
available on GitHub provides sample geo-temporal data that meets the
|
||||
[requirements of the Flux Geo package](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/#geo-schema-requirements).
|
||||
|
||||
### Load annotated CSV sample data
|
||||
Use the [experimental `csv.from()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/csv/from/)
|
||||
to load the sample bird migration annotated CSV data from GitHub:
|
||||
|
||||
```js
|
||||
import "experimental/csv"
|
||||
|
||||
sampleGeoData = csv.from(
|
||||
url: "https://github.com/influxdata/influxdb2-sample-data/blob/master/bird-migration-data/bird-migration.csv"
|
||||
)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
`csv.from(url: ...)` downloads sample data each time you execute the query **(~1.3 MB)**.
|
||||
If bandwidth is a concern, use the [`to()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/outputs/to/)
|
||||
to write the data to a bucket, and then query the bucket with [`from()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/from/).
|
||||
{{% /note %}}
|
||||
|
||||
### Write sample data to InfluxDB with line protocol
|
||||
Use `curl` and the `influx write` command to write bird migration line protocol to InfluxDB.
|
||||
Replace `db/rp` with your destination bucket:
|
||||
|
||||
```sh
|
||||
curl https://raw.githubusercontent.com/influxdata/influxdb2-sample-data/master/bird-migration-data/bird-migration.line --output ./tmp-data
|
||||
influx write -b db/rp @./tmp-data
|
||||
rm -f ./tmp-data
|
||||
```
|
||||
|
||||
Use Flux to query the bird migration data and assign it to the `sampleGeoData` variable:
|
||||
|
||||
```js
|
||||
sampleGeoData = from(bucket: "db/rp")
|
||||
|> range(start: 2019-01-01T00:00:00Z, stop: 2019-12-31T23:59:59Z)
|
||||
|> filter(fn: (r) => r._measurement == "migration")
|
||||
```
|
|
@ -0,0 +1,132 @@
|
|||
---
|
||||
title: Filter geo-temporal data by region
|
||||
description: >
|
||||
Use the `geo.filterRows` function to filter geo-temporal data by box-shaped, circular, or polygonal geographic regions.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Filter by region
|
||||
parent: Geo-temporal data
|
||||
weight: 302
|
||||
related:
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/filterrows/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/filter-by-region/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/filter-by-region/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(
|
||||
region: {lat: 30.04, lon: 31.23, radius: 200.0},
|
||||
strict: true
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [`geo.filterRows` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/filterrows/)
|
||||
to filter geo-temporal data by geographic region:
|
||||
|
||||
1. [Define a geographic region](#define-a-geographic-region)
|
||||
2. [Use strict or non-strict filtering](#strict-and-non-strict-filtering)
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.9/flux/guides/geo/#sample-data)
|
||||
and queries data points **within 200km of Cairo, Egypt**:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(
|
||||
region: {lat: 30.04, lon: 31.23, radius: 200.0},
|
||||
strict: true
|
||||
)
|
||||
```
|
||||
|
||||
## Define a geographic region
|
||||
Many functions in the Geo package filter data based on geographic region.
|
||||
Define a geographic region using one of the following shapes:
|
||||
|
||||
- [box](#box)
|
||||
- [circle](#circle)
|
||||
- [polygon](#polygon)
|
||||
|
||||
### box
|
||||
Define a box-shaped region by specifying a record containing the following properties:
|
||||
|
||||
- **minLat:** minimum latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **maxLat:** maximum latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **minLon:** minimum longitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **maxLon:** maximum longitude in decimal degrees (WGS 84) _(Float)_
|
||||
|
||||
##### Example box-shaped region
|
||||
```js
|
||||
{
|
||||
minLat: 40.51757813,
|
||||
maxLat: 40.86914063,
|
||||
minLon: -73.65234375,
|
||||
maxLon: -72.94921875
|
||||
}
|
||||
```
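For example, a sketch passing this box-shaped region to `geo.filterRows()` (using the `sampleGeoData` variable from the [sample data](/enterprise_influxdb/v1.9/flux/guides/geo/#sample-data)):

```js
import "experimental/geo"

sampleGeoData
  |> geo.filterRows(
    region: {
      minLat: 40.51757813,
      maxLat: 40.86914063,
      minLon: -73.65234375,
      maxLon: -72.94921875
    }
  )
```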
|
||||
|
||||
### circle
|
||||
Define a circular region by specifying a record containing the following properties:
|
||||
|
||||
- **lat**: latitude of the circle center in decimal degrees (WGS 84) _(Float)_
|
||||
- **lon**: longitude of the circle center in decimal degrees (WGS 84) _(Float)_
|
||||
- **radius**: radius of the circle in kilometers (km) _(Float)_
|
||||
|
||||
##### Example circular region
|
||||
```js
|
||||
{
|
||||
lat: 40.69335938,
|
||||
lon: -73.30078125,
|
||||
radius: 20.0
|
||||
}
|
||||
```
|
||||
|
||||
### polygon
|
||||
Define a polygonal region with a record containing the latitude and longitude for
|
||||
each point in the polygon:
|
||||
|
||||
- **points**: points that define the custom polygon _(Array of records)_
|
||||
|
||||
Define each point with a record containing the following properties:
|
||||
|
||||
- **lat**: latitude in decimal degrees (WGS 84) _(Float)_
|
||||
- **lon**: longitude in decimal degrees (WGS 84) _(Float)_
|
||||
|
||||
##### Example polygonal region
|
||||
```js
|
||||
{
|
||||
points: [
|
||||
{lat: 40.671659, lon: -73.936631},
|
||||
{lat: 40.706543, lon: -73.749177},
|
||||
{lat: 40.791333, lon: -73.880327}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Strict and non-strict filtering
|
||||
In most cases, the specified geographic region does not perfectly align with S2 grid cells.
|
||||
|
||||
- **Non-strict filtering** returns points that may be outside of the specified region but
|
||||
inside S2 grid cells partially covered by the region.
|
||||
- **Strict filtering** returns only points inside the specified region.
|
||||
|
||||
_Strict filtering is less performant, but more accurate than non-strict filtering._
|
||||
|
||||
<span class="key-geo-cell"></span> S2 grid cell
|
||||
<span class="key-geo-region"></span> Filter region
|
||||
<span class="key-geo-point"></span> Returned point
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Strict filtering**
|
||||
{{< svg "/static/svgs/geo-strict.svg" >}}
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**Non-strict filtering**
|
||||
{{< svg "/static/svgs/geo-non-strict.svg" >}}
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
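For example, a sketch of the same Cairo query with non-strict filtering, which may also return points outside the 200 km radius that fall inside S2 grid cells the region only partially covers:

```js
import "experimental/geo"

sampleGeoData
  |> geo.filterRows(
    region: {lat: 30.04, lon: 31.23, radius: 200.0},
    strict: false
  )
```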
|
|
@ -0,0 +1,76 @@
|
|||
---
|
||||
title: Group geo-temporal data
|
||||
description: >
|
||||
Use the `geo.groupByArea()` function to group geo-temporal data by area and the `geo.asTracks()` function
|
||||
to group data into tracks or routes.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Geo-temporal data
|
||||
weight: 302
|
||||
related:
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/groupbyarea/
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/astracks/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/group-geo-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/group-geo-data/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|
||||
|> geo.asTracks(groupBy: ["id"],sortBy: ["_time"])
|
||||
```
|
||||
---
|
||||
|
||||
Use the `geo.groupByArea()` function to group geo-temporal data by area and the `geo.asTracks()` function
|
||||
to group data into tracks or routes.
|
||||
|
||||
- [Group data by area](#group-data-by-area)
|
||||
- [Group data into tracks or routes](#group-data-by-track-or-route)
|
||||
|
||||
### Group data by area
|
||||
Use the [`geo.groupByArea()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/groupbyarea/)
|
||||
to group geo-temporal data points by geographic area.
|
||||
Areas are determined by [S2 grid cells](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering).
|
||||
|
||||
- Specify a new column to store the unique area identifier for each point with the `newColumn` parameter.
|
||||
- Specify the [S2 cell level](https://s2geometry.io/resources/s2cell_statistics)
|
||||
to use when calculating geographic areas with the `level` parameter.
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.9/flux/guides/geo/#sample-data)
|
||||
to query data points within 200km of Cairo, Egypt and group them by geographic area:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.groupByArea(
|
||||
newColumn: "geoArea",
|
||||
level: 5
|
||||
)
|
||||
```
|
||||
|
||||
### Group data by track or route
|
||||
Use the [`geo.asTracks()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/astracks/)
|
||||
to group data points into tracks or routes and order them by time or other columns.
|
||||
Data must contain a unique identifier for each track. For example: `id` or `tid`.
|
||||
|
||||
- Specify columns that uniquely identify each track or route with the `groupBy` parameter.
|
||||
- Specify which columns to sort by with the `sortBy` parameter. Default is `["_time"]`.
|
||||
|
||||
The following example uses the [sample bird migration data](/enterprise_influxdb/v1.9/flux/guides/geo/#sample-data)
|
||||
to query data points within 200km of Cairo, Egypt and group them into routes unique
|
||||
to each bird:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|
||||
|> geo.asTracks(
|
||||
groupBy: ["id"],
|
||||
sortBy: ["_time"]
|
||||
)
|
||||
```
|
|
@ -0,0 +1,122 @@
|
|||
---
|
||||
title: Shape data to work with the Geo package
|
||||
description: >
|
||||
Functions in the Flux Geo package require **lat** and **lon** fields and an **s2_cell_id** tag.
|
||||
Rename latitude and longitude fields and generate S2 cell ID tokens.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Shape geo-temporal data
|
||||
parent: Geo-temporal data
|
||||
weight: 301
|
||||
related:
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/
|
||||
- /{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/shapedata/
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/geo/shape-geo-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/geo/shape-geo-data/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
sampleGeoData
|
||||
|> geo.shapeData(latField: "latitude", lonField: "longitude", level: 10)
|
||||
```
|
||||
---
|
||||
|
||||
Functions in the Geo package require the following data schema:
|
||||
|
||||
- an **s2_cell_id** tag containing the [S2 Cell ID](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering)
|
||||
**as a token**
|
||||
- a **`lat` field** containing the **latitude in decimal degrees** (WGS 84)
|
||||
- a **`lon` field** containing the **longitude in decimal degrees** (WGS 84)
|
||||
|
||||
## Shape geo-temporal data
|
||||
If your data already contains latitude and longitude fields, use the
|
||||
[`geo.shapeData()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/shapedata/)
|
||||
to rename the fields to match the requirements of the Geo package, pivot the data
|
||||
into row-wise sets, and generate S2 cell ID tokens for each point.
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "example-bucket")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> geo.shapeData(
|
||||
latField: "latitude",
|
||||
lonField: "longitude",
|
||||
level: 10
|
||||
)
|
||||
```
|
||||
|
||||
## Generate S2 cell ID tokens
|
||||
The Geo package uses the [S2 Geometry Library](https://s2geometry.io/) to represent
|
||||
geographic coordinates on a three-dimensional sphere.
|
||||
The sphere is divided into [cells](https://s2geometry.io/devguide/s2cell_hierarchy),
|
||||
each with a unique 64-bit identifier (S2 cell ID).
|
||||
Grid and S2 cell ID accuracy are defined by a [level](https://s2geometry.io/resources/s2cell_statistics).
|
||||
|
||||
{{% note %}}
|
||||
To filter more quickly, use higher S2 Cell ID levels,
|
||||
but know that higher levels increase [series cardinality](/enterprise_influxdb/v1.9/concepts/glossary/#series-cardinality).
|
||||
{{% /note %}}
|
||||
|
||||
The Geo package requires S2 cell IDs as tokens.
|
||||
To generate and add S2 cell ID tokens to your data, use one of the following options:
|
||||
|
||||
- [Generate S2 cell ID tokens with Telegraf](#generate-s2-cell-id-tokens-with-telegraf)
|
||||
- [Generate S2 cell ID tokens with language-specific libraries](#generate-s2-cell-id-tokens-with-language-specific-libraries)
|
||||
- [Generate S2 cell ID tokens with Flux](#generate-s2-cell-id-tokens-with-flux)
|
||||
|
||||
### Generate S2 cell ID tokens with Telegraf
|
||||
Enable the [Telegraf S2 Geo (`s2geo`) processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/s2geo)
|
||||
to generate S2 cell ID tokens at a specified `cell_level` using `lat` and `lon` field values.
|
||||
|
||||
Add the `processors.s2geo` configuration to your Telegraf configuration file (`telegraf.conf`):
|
||||
|
||||
```toml
|
||||
[[processors.s2geo]]
|
||||
## The name of the lat and lon fields containing WGS-84 latitude and
|
||||
## longitude in decimal degrees.
|
||||
lat_field = "lat"
|
||||
lon_field = "lon"
|
||||
|
||||
## New tag to create
|
||||
tag_key = "s2_cell_id"
|
||||
|
||||
## Cell level (see https://s2geometry.io/resources/s2cell_statistics.html)
|
||||
cell_level = 9
|
||||
```
|
||||
|
||||
Telegraf stores the S2 cell ID token in the `s2_cell_id` tag.
|
||||
|
||||
### Generate S2 cell ID tokens with language-specific libraries
|
||||
Many programming languages offer S2 Libraries with methods for generating S2 cell ID tokens.
|
||||
Use latitude and longitude with the `s2.CellID.ToToken` method of the S2 Geometry
|
||||
Library to generate `s2_cell_id` tags. For example:
|
||||
|
||||
- **Go:** [s2.CellID.ToToken()](https://godoc.org/github.com/golang/geo/s2#CellID.ToToken)
|
||||
- **Python:** [s2sphere.CellId.to_token()](https://s2sphere.readthedocs.io/en/latest/api.html#s2sphere.CellId)
|
||||
- **JavaScript:** [s2.cellid.toToken()](https://github.com/mapbox/node-s2/blob/master/API.md#cellidtotoken---string)
|
||||
|
||||
### Generate S2 cell ID tokens with Flux
|
||||
Use the [`geo.s2CellIDToken()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/s2cellidtoken/)
|
||||
with existing longitude (`lon`) and latitude (`lat`) field values to generate and add the S2 cell ID token.
|
||||
First, use the [`geo.toRows()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/torows/)
|
||||
to pivot **lat** and **lon** fields into row-wise sets:
|
||||
|
||||
```js
|
||||
import "experimental/geo"
|
||||
|
||||
from(bucket: "example-bucket")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
|> geo.toRows()
|
||||
|> map(fn: (r) => ({ r with
|
||||
s2_cell_id: geo.s2CellIDToken(point: {lon: r.lon, lat: r.lat}, level: 10)
|
||||
}))
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The [`geo.shapeData()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/geo/shapedata/)
|
||||
generates S2 cell ID tokens as well.
|
||||
{{% /note %}}
|
|
@ -0,0 +1,676 @@
|
|||
---
|
||||
title: Group data in InfluxDB with Flux
|
||||
list_title: Group
|
||||
description: >
|
||||
Use the `group()` function to group data with common values in specific columns.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Group
|
||||
parent: Query with Flux
|
||||
weight: 2
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/guides/grouping-data/
|
||||
list_query_example: group
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/group-data/
|
||||
v2: /influxdb/v2.0/query-data/flux/group-data/
|
||||
---
|
||||
|
||||
With Flux, you can group data by any column in your queried data set.
|
||||
"Grouping" partitions data into tables in which each row shares a common value for specified columns.
|
||||
This guide walks through grouping data in Flux and provides examples of how data is shaped in the process.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Group keys
|
||||
Every table has a **group key** – a list of columns for which every row in the table has the same value.
|
||||
|
||||
###### Example group key
|
||||
```js
|
||||
[_start, _stop, _field, _measurement, host]
|
||||
```
|
||||
|
||||
Grouping data in Flux is essentially defining the group key of output tables.
|
||||
Understanding how modifying group keys shapes output data is key to successfully
|
||||
grouping and transforming data into your desired output.
|
||||
|
||||
## group() function
|
||||
Flux's [`group()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/group) defines the
|
||||
group key for output tables, i.e. grouping records based on values for specific columns.
|
||||
|
||||
###### group() example
|
||||
```js
|
||||
dataStream
|
||||
|> group(columns: ["cpu", "host"])
|
||||
```
|
||||
|
||||
###### Resulting group key
|
||||
```js
|
||||
[cpu, host]
|
||||
```
|
||||
|
||||
The `group()` function has the following parameters:
|
||||
|
||||
### columns
|
||||
The list of columns to include or exclude (depending on the [mode](#mode)) in the grouping operation.
|
||||
|
||||
### mode
|
||||
The method used to define the group and resulting group key.
|
||||
Possible values include `by` and `except`.
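For example, a sketch that uses `mode: "except"` to group by every column _except_ those listed, removing `_time` and `_value` from the group key:

```js
dataStream
  |> group(columns: ["_time", "_value"], mode: "except")
```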
|
||||
|
||||
|
||||
## Example grouping operations
|
||||
To illustrate how grouping works, define a `dataSet` variable that queries System
|
||||
CPU usage from the `db/rp` bucket.
|
||||
Filter the `cpu` tag so it only returns results for each numbered CPU core.
|
||||
|
||||
### Data set
|
||||
CPU used by system operations for all numbered CPU cores.
|
||||
It uses a regular expression to filter only numbered cores.
|
||||
|
||||
```js
|
||||
dataSet = from(bucket: "db/rp")
|
||||
|> range(start: -2m)
|
||||
|> filter(fn: (r) =>
|
||||
r._field == "usage_system" and
|
||||
r.cpu =~ /cpu[0-9*]/
|
||||
)
|
||||
|> drop(columns: ["host"])
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
This example drops the `host` column from the returned data since the CPU data
|
||||
is tracked for only a single host, and dropping it simplifies the output tables.
|
||||
Don't drop the `host` column if monitoring multiple hosts.
|
||||
{{% /note %}}
|
||||
|
||||
{{% truncate %}}
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:00.000000000Z 7.892107892107892
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:10.000000000Z 7.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:20.000000000Z 7.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:30.000000000Z 5.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:40.000000000Z 7.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:34:50.000000000Z 7.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:00.000000000Z 10.3
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:10.000000000Z 9.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:20.000000000Z 8.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:30.000000000Z 8.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:40.000000000Z 8.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:35:50.000000000Z 10.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu0 2018-11-05T21:36:00.000000000Z 10.6
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:00.000000000Z 0.7992007992007992
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:10.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:20.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:30.000000000Z 0.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:40.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:34:50.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:00.000000000Z 1.4
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:10.000000000Z 1.2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:20.000000000Z 0.8
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:30.000000000Z 0.8991008991008991
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:40.000000000Z 0.8008008008008008
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:35:50.000000000Z 0.999000999000999
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu1 2018-11-05T21:36:00.000000000Z 1.1022044088176353
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:00.000000000Z 4.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:10.000000000Z 3.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:20.000000000Z 3.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:30.000000000Z 2.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:40.000000000Z 4.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:34:50.000000000Z 4.895104895104895
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:00.000000000Z 6.906906906906907
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:10.000000000Z 5.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:20.000000000Z 5.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:30.000000000Z 4.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:40.000000000Z 5.1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:35:50.000000000Z 5.9
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu2 2018-11-05T21:36:00.000000000Z 6.4935064935064934
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement, cpu]
|
||||
_start:time _stop:time _field:string _measurement:string cpu:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:00.000000000Z 0.5005005005005005
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:10.000000000Z 0.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:20.000000000Z 0.5
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:30.000000000Z 0.3
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:40.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:34:50.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:00.000000000Z 1.3986013986013985
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:10.000000000Z 0.9
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:20.000000000Z 0.5005005005005005
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:30.000000000Z 0.7
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:40.000000000Z 0.6
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:35:50.000000000Z 0.8
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z usage_system cpu cpu3 2018-11-05T21:36:00.000000000Z 0.9
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
**Note that the group key is output with each table: `Table: keys: <group-key>`.**
|
||||
|
||||

|
||||
|
||||
### Group by CPU
|
||||
Group the `dataSet` stream by the `cpu` column.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["cpu"])
|
||||
```
|
||||
|
||||
This won't actually change the structure of the data since it already has `cpu`
|
||||
in the group key and is therefore grouped by `cpu`.
|
||||
However, notice that it does change the group key:
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by CPU output tables
|
||||
```
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 7.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 10.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 9.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 8.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 8.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 8.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 10.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu0 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 1.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.8991008991008991 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.8008008008008008 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.999000999000999 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu1 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 2.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 4.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 4.895104895104895 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 6.906906906906907 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 5.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 4.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 5.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 5.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu2 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [cpu]
|
||||
cpu:string _stop:time _time:time _value:float _field:string _measurement:string _start:time
|
||||
---------------------- ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:10.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:20.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:30.000000000Z 0.3 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:50.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:00.000000000Z 1.3986013986013985 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:10.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:20.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:30.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:40.000000000Z 0.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:35:50.000000000Z 0.8 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
cpu3 2018-11-05T21:36:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
The visualization remains the same.
|
||||
|
||||

|
||||
|
||||
### Group by time
|
||||
Grouping data by the `_time` column is a good illustration of how grouping changes the structure of your data.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["_time"])
|
||||
```
|
||||
|
||||
When grouping by `_time`, all records that share a common `_time` value are grouped into individual tables.
|
||||
So each output table represents a single point in time.
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by time output tables
|
||||
```
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu cpu0
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu cpu1
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu cpu2
|
||||
2018-11-05T21:34:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu cpu0
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu cpu2
|
||||
2018-11-05T21:34:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu cpu2
|
||||
2018-11-05T21:34:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu cpu0
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.4 usage_system cpu cpu1
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 2.6 usage_system cpu cpu2
|
||||
2018-11-05T21:34:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.3 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu cpu0
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.5 usage_system cpu cpu2
|
||||
2018-11-05T21:34:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 7.5 usage_system cpu cpu0
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu1
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.895104895104895 usage_system cpu cpu2
|
||||
2018-11-05T21:34:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.3 usage_system cpu cpu0
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.4 usage_system cpu cpu1
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.906906906906907 usage_system cpu cpu2
|
||||
2018-11-05T21:35:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.3986013986013985 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 9.2 usage_system cpu cpu0
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.2 usage_system cpu cpu1
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.7 usage_system cpu cpu2
|
||||
2018-11-05T21:35:10.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.4 usage_system cpu cpu0
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu1
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
|
||||
2018-11-05T21:35:20.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.5 usage_system cpu cpu0
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8991008991008991 usage_system cpu cpu1
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 4.7 usage_system cpu cpu2
|
||||
2018-11-05T21:35:30.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 8.6 usage_system cpu cpu0
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8008008008008008 usage_system cpu cpu1
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.1 usage_system cpu cpu2
|
||||
2018-11-05T21:35:40.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.6 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.2 usage_system cpu cpu0
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.999000999000999 usage_system cpu cpu1
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 5.9 usage_system cpu cpu2
|
||||
2018-11-05T21:35:50.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.8 usage_system cpu cpu3
|
||||
|
||||
Table: keys: [_time]
|
||||
_time:time _start:time _stop:time _value:float _field:string _measurement:string cpu:string
|
||||
------------------------------ ------------------------------ ------------------------------ ---------------------------- ---------------------- ---------------------- ----------------------
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 10.6 usage_system cpu cpu0
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 1.1022044088176353 usage_system cpu cpu1
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 6.4935064935064934 usage_system cpu cpu2
|
||||
2018-11-05T21:36:00.000000000Z 2018-11-05T21:34:00.000000000Z 2018-11-05T21:36:00.000000000Z 0.9 usage_system cpu cpu3
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
Because each timestamp is structured as a separate table, when visualized, all
|
||||
points that share the same timestamp appear connected.
|
||||
|
||||

|
||||
|
||||
{{% note %}}
|
||||
With some further processing, you could calculate the average CPU usage across all CPUs at each point
|
||||
in time and group the results into a single table, but we won't cover that in this example.
|
||||
If you're interested in running and visualizing this yourself, here's what the query would look like:
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["_time"])
|
||||
|> mean()
|
||||
|> group(columns: ["_value", "_time"], mode: "except")
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
### Group by CPU and time
|
||||
Group by the `cpu` and `_time` columns.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> group(columns: ["cpu", "_time"])
|
||||
```
|
||||
|
||||
This outputs a table for every unique `cpu` and `_time` combination:
|
||||
|
||||
{{% truncate %}}
|
||||
###### Group by CPU and time output tables
|
||||
```
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.892107892107892 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7992007992007992 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 4.1 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:00.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5005005005005005 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.2 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.6 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:10.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 7.4 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu1 2018-11-05T21:36:00.000000000Z 0.7 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu2 2018-11-05T21:36:00.000000000Z 3.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:20.000000000Z cpu3 2018-11-05T21:36:00.000000000Z 0.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
------------------------------ ---------------------- ------------------------------ ---------------------------- ---------------------- ---------------------- ------------------------------
|
||||
2018-11-05T21:34:30.000000000Z cpu0 2018-11-05T21:36:00.000000000Z 5.5 usage_system cpu 2018-11-05T21:34:00.000000000Z
|
||||
|
||||
Table: keys: [_time, cpu]
|
||||
_time:time cpu:string _stop:time _value:float _field:string _measurement:string _start:time
|
||||
2018-11-05T21:34:30.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.4  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:30.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  2.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:30.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.3  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:40.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  7.4  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:40.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.7  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:40.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  4.5  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:40.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:50.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  7.5  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:50.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.7  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:50.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  4.895104895104895  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:34:50.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:00.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  10.3  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:00.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  1.4  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:00.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  6.906906906906907  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:00.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  1.3986013986013985  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:10.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  9.2  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:10.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  1.2  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:10.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  5.7  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:10.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.9  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:20.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  8.4  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:20.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.8  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:20.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  5.1  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:20.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.5005005005005005  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:30.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  8.5  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:30.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.8991008991008991  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:30.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  4.7  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:30.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.7  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:40.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  8.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:40.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.8008008008008008  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:40.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  5.1  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:40.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:50.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  10.2  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:50.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  0.999000999000999  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:50.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  5.9  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:35:50.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.8  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:36:00.000000000Z  cpu0  2018-11-05T21:36:00.000000000Z  10.6  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:36:00.000000000Z  cpu1  2018-11-05T21:36:00.000000000Z  1.1022044088176353  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:36:00.000000000Z  cpu2  2018-11-05T21:36:00.000000000Z  6.4935064935064934  usage_system  cpu  2018-11-05T21:34:00.000000000Z

Table: keys: [_time, cpu]
_time:time  cpu:string  _stop:time  _value:float  _field:string  _measurement:string  _start:time
2018-11-05T21:36:00.000000000Z  cpu3  2018-11-05T21:36:00.000000000Z  0.9  usage_system  cpu  2018-11-05T21:34:00.000000000Z
```
{{% /truncate %}}

When visualized, tables appear as individual, unconnected points.

Grouping by `cpu` and `_time` is a good illustration of how grouping works.

## In conclusion
Grouping is a powerful way to shape your data into your desired output format.
It modifies the group keys of output tables, grouping records into tables that
all share common values within specified columns.
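
For example, a minimal `group()` call (a sketch that assumes the same CPU data and a hypothetical `db/rp` bucket) regroups records by a single tag:

```js
from(bucket: "db/rp")
  |> range(start: -2m)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
  // Regroup so each output table contains all points for one value of the cpu tag
  |> group(columns: ["cpu"])
```

Each output table then contains every point for one `cpu` value, regardless of timestamp.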

@@ -0,0 +1,142 @@

---
title: Create histograms with Flux
list_title: Histograms
description: >
  Use the `histogram()` function to create cumulative histograms with Flux.
menu:
  enterprise_influxdb_1_9:
    name: Histograms
    parent: Query with Flux
    weight: 10
list_query_example: histogram
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/histograms/
v2: /influxdb/v2.0/query-data/flux/histograms/
---

Histograms provide valuable insight into the distribution of your data.
This guide walks through using Flux's `histogram()` function to transform your data into a **cumulative histogram**.

## histogram() function

The [`histogram()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/histogram) approximates the
cumulative distribution of a dataset by counting data frequencies for a list of "bins."
A **bin** is simply a range in which a data point falls.
All data points that are less than or equal to the bound are counted in the bin.
In the histogram output, a column named `le` is added that represents the upper bound of each bin.
Bin counts are cumulative.

```js
from(bucket: "telegraf/autogen")
  |> range(start: -5m)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r._field == "used_percent"
  )
  |> histogram(bins: [0.0, 10.0, 20.0, 30.0])
```

> Values output by the `histogram` function represent points of data aggregated over time.
> Since values do not represent single points in time, there is no `_time` column in the output table.

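If you need per-bin (non-cumulative) counts, one option, not shown in the original example, is to follow `histogram()` with `difference()`, which subtracts each cumulative count from the next. A minimal sketch, assuming the same query as above:

```js
from(bucket: "telegraf/autogen")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
  |> histogram(bins: [0.0, 10.0, 20.0, 30.0])
  // Convert cumulative bin counts to per-bin counts
  |> difference()
```
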
## Bin helper functions

Flux provides two helper functions for generating histogram bins.
Each generates and outputs an array of floats designed to be used in the `histogram()` function's `bins` parameter.

### linearBins()
The [`linearBins()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/misc/linearbins) generates a list of linearly separated floats.

```js
linearBins(start: 0.0, width: 10.0, count: 10)

// Generated list: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, +Inf]
```

### logarithmicBins()
The [`logarithmicBins()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/misc/logarithmicbins) generates a list of exponentially separated floats.

```js
logarithmicBins(start: 1.0, factor: 2.0, count: 10, infinity: true)

// Generated list: [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, +Inf]
```

## Examples

### Generating a histogram with linear bins
```js
from(bucket: "telegraf/autogen")
  |> range(start: -5m)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r._field == "used_percent"
  )
  |> histogram(
    bins: linearBins(
      start: 65.5,
      width: 0.5,
      count: 20,
      infinity: false
    )
  )
```

###### Output table
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time  _stop:time  _field:string  _measurement:string  host:string  le:float  _value:float
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  65.5  5
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  66  6
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  66.5  8
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  67  9
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  67.5  9
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  68  10
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  68.5  12
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  69  12
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  69.5  15
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  70  23
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  70.5  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  71  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  71.5  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  72  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  72.5  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  73  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  73.5  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  74  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  74.5  30
2018-11-07T22:19:58.423658000Z  2018-11-07T22:24:58.423658000Z  used_percent  mem  Scotts-MacBook-Pro.local  75  30
```

### Generating a histogram with logarithmic bins
```js
from(bucket: "telegraf/autogen")
  |> range(start: -5m)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r._field == "used_percent"
  )
  |> histogram(
    bins: logarithmicBins(
      start: 0.5,
      factor: 2.0,
      count: 10,
      infinity: false
    )
  )
```

###### Output table
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time  _stop:time  _field:string  _measurement:string  host:string  le:float  _value:float
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  0.5  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  1  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  2  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  4  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  8  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  16  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  32  0
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  64  2
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  128  30
2018-11-07T22:23:36.860664000Z  2018-11-07T22:28:36.860664000Z  used_percent  mem  Scotts-MacBook-Pro.local  256  30
```
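
Because `histogram()` output is cumulative, it can also feed `histogramQuantile()` to estimate quantiles from the binned data. A minimal sketch (not part of the original example output):

```js
from(bucket: "telegraf/autogen")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "mem" and r._field == "used_percent")
  |> histogram(bins: linearBins(start: 65.5, width: 0.5, count: 20, infinity: false))
  // Approximate the median of used_percent from the cumulative histogram
  |> histogramQuantile(quantile: 0.5)
```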

@@ -0,0 +1,56 @@

---
title: Calculate the increase
seotitle: Calculate the increase in Flux
list_title: Increase
description: >
  Use the `increase()` function to track increases across multiple columns in a table.
  This function is especially useful when tracking changes in counter values that
  wrap over time or periodically reset.
weight: 10
menu:
  enterprise_influxdb_1_9:
    parent: Query with Flux
    name: Increase
list_query_example: increase
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/increase/
v2: /influxdb/v2.0/query-data/flux/increase/
---

Use the [`increase()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/increase/)
to track increases across multiple columns in a table.
This function is especially useful when tracking changes in counter values that
wrap over time or periodically reset.

```js
data
  |> increase()
```

`increase()` returns a cumulative sum of **non-negative** differences between rows in a table.
For example:

{{< flex >}}
{{% flex-content %}}
**Given the following input:**

| _time                | _value |
|:-----                | ------:|
| 2020-01-01T00:01:00Z | 1      |
| 2020-01-01T00:02:00Z | 2      |
| 2020-01-01T00:03:00Z | 8      |
| 2020-01-01T00:04:00Z | 10     |
| 2020-01-01T00:05:00Z | 0      |
| 2020-01-01T00:06:00Z | 4      |
{{% /flex-content %}}
{{% flex-content %}}
**`increase()` returns:**

| _time                | _value |
|:-----                | ------:|
| 2020-01-01T00:02:00Z | 1      |
| 2020-01-01T00:03:00Z | 7      |
| 2020-01-01T00:04:00Z | 9      |
| 2020-01-01T00:05:00Z | 9      |
| 2020-01-01T00:06:00Z | 13     |
{{% /flex-content %}}
{{< /flex >}}
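
In a full query, `increase()` is typically applied to a monotonically increasing counter field. A sketch, assuming a Telegraf `net` measurement with a `bytes_recv` counter and a hypothetical `db/rp` bucket:

```js
from(bucket: "db/rp")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "net" and r._field == "bytes_recv")
  // Total received bytes over the hour, unaffected by counter resets
  |> increase()
```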

@@ -0,0 +1,312 @@

---
title: Join data with Flux
seotitle: Join data in InfluxDB with Flux
list_title: Join
description: This guide walks through joining data with Flux and outlines how it shapes your data in the process.
menu:
  enterprise_influxdb_1_9:
    name: Join
    parent: Query with Flux
    weight: 10
list_query_example: join
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/join/
v2: /influxdb/v2.0/query-data/flux/join/
---

The [`join()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/join) merges two or more
input streams, whose values are equal on a set of common columns, into a single output stream.
Flux allows you to join on any columns common between two data streams and opens the door
for operations such as cross-measurement joins and math across measurements.

To illustrate a join operation, this guide uses data captured by Telegraf and stored in
InfluxDB: memory usage and total running processes.

In this guide, we'll join two data streams, one representing memory usage and the other representing the
total number of running processes, then calculate the average memory usage per running process.

If you're just getting started with Flux queries, check out the following:

- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.

## Define stream variables
In order to perform a join, you must have two streams of data.
Assign a variable to each data stream.

### Memory used variable
Define a `memUsed` variable that filters on the `mem` measurement and the `used` field.
This returns the amount of memory (in bytes) used.

###### memUsed stream definition
```js
memUsed = from(bucket: "db/rp")
  |> range(start: -5m)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r._field == "used"
  )
```

{{% truncate %}}
###### memUsed data output
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time  _stop:time  _field:string  _measurement:string  host:string  _time:time  _value:int
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:00.000000000Z  10956333056
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:10.000000000Z  11014008832
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:20.000000000Z  11373428736
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:30.000000000Z  11001421824
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:40.000000000Z  10985852928
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:50:50.000000000Z  10992279552
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:00.000000000Z  11053568000
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:10.000000000Z  11092242432
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:20.000000000Z  11612774400
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:30.000000000Z  11131961344
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:40.000000000Z  11124805632
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:51:50.000000000Z  11332464640
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:00.000000000Z  11176923136
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:10.000000000Z  11181068288
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:20.000000000Z  11182579712
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:30.000000000Z  11238862848
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:40.000000000Z  11275296768
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:52:50.000000000Z  11225411584
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:00.000000000Z  11252690944
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:10.000000000Z  11227029504
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:20.000000000Z  11201646592
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:30.000000000Z  11227897856
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:40.000000000Z  11330428928
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:53:50.000000000Z  11347976192
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:00.000000000Z  11368271872
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:10.000000000Z  11269623808
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:20.000000000Z  11295637504
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:30.000000000Z  11354423296
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:40.000000000Z  11379687424
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:54:50.000000000Z  11248926720
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  used  mem  host1.local  2018-11-06T05:55:00.000000000Z  11292524544
```
{{% /truncate %}}

### Total processes variable
Define a `procTotal` variable that filters on the `processes` measurement and the `total` field.
This returns the number of running processes.

###### procTotal stream definition
```js
procTotal = from(bucket: "db/rp")
  |> range(start: -5m)
  |> filter(fn: (r) =>
    r._measurement == "processes" and
    r._field == "total"
  )
```

{{% truncate %}}
###### procTotal data output
```
Table: keys: [_start, _stop, _field, _measurement, host]
_start:time  _stop:time  _field:string  _measurement:string  host:string  _time:time  _value:int
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:00.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:10.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:20.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:30.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:40.000000000Z  469
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:50:50.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:00.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:10.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:20.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:30.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:40.000000000Z  469
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:51:50.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:00.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:10.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:20.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:30.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:40.000000000Z  472
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:52:50.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:00.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:10.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:20.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:30.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:40.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:53:50.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:00.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:10.000000000Z  470
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:20.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:30.000000000Z  473
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:40.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:54:50.000000000Z  471
2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  total  processes  host1.local  2018-11-06T05:55:00.000000000Z  471
```
{{% /truncate %}}

## Join the two data streams
With the two data streams defined, use the `join()` function to join them together.
`join()` requires two parameters:

##### `tables`
A map of tables to join with keys by which they will be aliased.
In the example below, `mem` is the alias for `memUsed` and `proc` is the alias for `procTotal`.

##### `on`
An array of strings defining the columns on which the tables will be joined.
_**Both tables must have all columns specified in this list.**_

```js
join(
  tables: {mem: memUsed, proc: procTotal},
  on: ["_time", "_stop", "_start", "host"]
)
```

{{% truncate %}}
###### Joined output table
```
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
_field_mem:string  _field_proc:string  _measurement_mem:string  _measurement_proc:string  _start:time  _stop:time  host:string  _time:time  _value_mem:int  _value_proc:int
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:00.000000000Z  10956333056  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:10.000000000Z  11014008832  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:20.000000000Z  11373428736  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:30.000000000Z  11001421824  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:40.000000000Z  10985852928  469
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:50.000000000Z  10992279552  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:00.000000000Z  11053568000  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:10.000000000Z  11092242432  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:20.000000000Z  11612774400  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:30.000000000Z  11131961344  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:40.000000000Z  11124805632  469
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:50.000000000Z  11332464640  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:00.000000000Z  11176923136  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:10.000000000Z  11181068288  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:20.000000000Z  11182579712  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:30.000000000Z  11238862848  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:40.000000000Z  11275296768  472
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:50.000000000Z  11225411584  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:00.000000000Z  11252690944  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:10.000000000Z  11227029504  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:20.000000000Z  11201646592  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:30.000000000Z  11227897856  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:40.000000000Z  11330428928  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:50.000000000Z  11347976192  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:00.000000000Z  11368271872  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:10.000000000Z  11269623808  470
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:20.000000000Z  11295637504  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:30.000000000Z  11354423296  473
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:40.000000000Z  11379687424  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:50.000000000Z  11248926720  471
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:55:00.000000000Z  11292524544  471
```
{{% /truncate %}}

Notice the output table includes the following columns:

- `_field_mem`
- `_field_proc`
- `_measurement_mem`
- `_measurement_proc`
- `_value_mem`
- `_value_proc`

These represent the columns with values unique to the two input tables.

## Calculate and create a new table
With the two streams of data joined into a single table, use the
[`map()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map)
to build a new table: map the existing `_time` column to a new `_time` column, then
divide `_value_mem` by `_value_proc` and map the result to a new `_value` column.

```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
  |> map(fn: (r) => ({
    _time: r._time,
    _value: r._value_mem / r._value_proc
  }))
```

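The mapped record above contains only the keys it explicitly defines (plus the columns in the group key). If you would rather keep every existing column and only add or recalculate `_value`, one option is the `with` operator, sketched below rather than taken from the original example:

```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
  // "r with" copies all existing columns and overwrites or appends _value
  |> map(fn: (r) => ({ r with _value: r._value_mem / r._value_proc }))
```
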
{{% truncate %}}
###### Mapped table
```
Table: keys: [_field_mem, _field_proc, _measurement_mem, _measurement_proc, _start, _stop, host]
_field_mem:string  _field_proc:string  _measurement_mem:string  _measurement_proc:string  _start:time  _stop:time  host:string  _time:time  _value:int
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:00.000000000Z  23311346
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:10.000000000Z  23434061
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:20.000000000Z  24147407
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:30.000000000Z  23407280
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:40.000000000Z  23423993
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:50:50.000000000Z  23338173
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:00.000000000Z  23518229
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:10.000000000Z  23600515
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:20.000000000Z  24708030
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:30.000000000Z  23685024
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:40.000000000Z  23720267
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:51:50.000000000Z  24060434
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:00.000000000Z  23730197
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:10.000000000Z  23789506
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:20.000000000Z  23792722
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:30.000000000Z  23861704
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:40.000000000Z  23888340
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:52:50.000000000Z  23833145
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:00.000000000Z  23941895
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:10.000000000Z  23887296
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:20.000000000Z  23833290
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:30.000000000Z  23838424
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:40.000000000Z  24056112
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:53:50.000000000Z  24093367
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:00.000000000Z  24136458
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:10.000000000Z  23977922
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:20.000000000Z  23982245
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:30.000000000Z  24005123
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:40.000000000Z  24160695
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:54:50.000000000Z  23883071
used  total  mem  processes  2018-11-06T05:50:00.000000000Z  2018-11-06T05:55:00.000000000Z  Scotts-MacBook-Pro.local  2018-11-06T05:55:00.000000000Z  23975635
```
{{% /truncate %}}

This table represents the average amount of memory in bytes per running process.

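Because `_value_mem` and `_value_proc` are both integers, the division above is integer division. If you prefer a fractional result, one option (not part of the original example) is to convert the operands with `float()`:

```js
join(tables: {mem: memUsed, proc: procTotal}, on: ["_time", "_stop", "_start", "host"])
  |> map(fn: (r) => ({
    _time: r._time,
    // float() avoids truncating the quotient to a whole number of bytes
    _value: float(v: r._value_mem) / float(v: r._value_proc)
  }))
```
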
## Real world example
The following function calculates the batch sizes written to an InfluxDB cluster by joining
fields from `httpd` and `write` measurements in order to compare `pointReq` and `writeReq`.
The results are grouped by cluster ID so you can make comparisons across clusters.

```js
batchSize = (cluster_id, start=-1m, interval=10s) => {
  httpd = from(bucket: "telegraf")
    |> range(start: start)
    |> filter(fn: (r) =>
      r._measurement == "influxdb_httpd" and
      r._field == "writeReq" and
      r.cluster_id == cluster_id
    )
    |> aggregateWindow(every: interval, fn: mean)
    |> derivative(nonNegative: true, unit: 60s)

  write = from(bucket: "telegraf")
    |> range(start: start)
    |> filter(fn: (r) =>
      r._measurement == "influxdb_write" and
      r._field == "pointReq" and
      r.cluster_id == cluster_id
    )
    |> aggregateWindow(every: interval, fn: max)
    |> derivative(nonNegative: true, unit: 60s)

  return join(
    tables: {httpd: httpd, write: write},
    on: ["_time", "_stop", "_start", "host"]
  )
    |> map(fn: (r) => ({
      _time: r._time,
      _value: r._value_httpd / r._value_write
    }))
    |> group(columns: ["cluster_id"])
}

batchSize(cluster_id: "enter cluster id here")
```

@@ -0,0 +1,189 @@

---
title: Manipulate timestamps with Flux
list_title: Manipulate timestamps
description: >
  Use Flux to process and manipulate timestamps.
menu:
  enterprise_influxdb_1_9:
    name: Manipulate timestamps
    parent: Query with Flux
    weight: 20
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/manipulate-timestamps/
v2: /influxdb/v2.0/query-data/flux/manipulate-timestamps/
---

Every point stored in InfluxDB has an associated timestamp.
Use Flux to process and manipulate timestamps to suit your needs.

- [Convert timestamp format](#convert-timestamp-format)
- [Calculate the duration between two timestamps](#calculate-the-duration-between-two-timestamps)
- [Retrieve the current time](#retrieve-the-current-time)
- [Normalize irregular timestamps](#normalize-irregular-timestamps)
- [Use timestamps and durations together](#use-timestamps-and-durations-together)

{{% note %}}
If you're just getting started with Flux queries, check out the following:

- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
{{% /note %}}

## Convert timestamp format

- [Unix nanosecond to RFC3339](#unix-nanosecond-to-rfc3339)
- [RFC3339 to Unix nanosecond](#rfc3339-to-unix-nanosecond)

### Unix nanosecond to RFC3339
Use the [`time()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/time/)
to convert a [Unix **nanosecond** timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#unix-timestamp)
to an [RFC3339 timestamp](/{{< latest "influxdb" "v2" >}}/reference/glossary/#rfc3339-timestamp).

```js
time(v: 1568808000000000000)
// Returns 2019-09-18T12:00:00.000000000Z
```

### RFC3339 to Unix nanosecond
Use the [`uint()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/uint/)
to convert an RFC3339 timestamp to a Unix nanosecond timestamp.

```js
uint(v: 2019-09-18T12:00:00.000000000Z)
// Returns 1568808000000000000
```

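To convert an entire `_time` column rather than a single value, one option is to apply the conversion inside `map()`. A minimal sketch, assuming `data` is any queried stream:

```js
data
  // Keep all existing columns and add a unix_time column with the nanosecond timestamp
  |> map(fn: (r) => ({ r with unix_time: uint(v: r._time) }))
```
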
## Calculate the duration between two timestamps
Flux doesn't support mathematical operations using [time type](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#time-types) values.
To calculate the duration between two timestamps:

1. Use the `uint()` function to convert each timestamp to a Unix nanosecond timestamp.
2. Subtract one Unix nanosecond timestamp from the other.
3. Use the `duration()` function to convert the result into a duration.

```js
time1 = uint(v: 2019-09-17T21:12:05Z)
time2 = uint(v: 2019-09-18T22:16:35Z)

duration(v: time2 - time1)
// Returns 25h4m30s
```

{{% note %}}
Flux doesn't support duration column types.
To store a duration in a column, use the [`string()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/string/)
to convert the duration to a string.
{{% /note %}}

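For example, a string representation of the duration calculated above might look like this (a sketch reusing the same `time1` and `time2` values):

```js
string(v: duration(v: time2 - time1))
// Returns "25h4m30s"
```
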
## Retrieve the current time
- [Current UTC time](#current-utc-time)
- [Current system time](#current-system-time)

### Current UTC time
Use the [`now()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/misc/now/) to
return the current UTC time in RFC3339 format.

```js
now()
```

{{% note %}}
`now()` is cached at runtime, so all instances of `now()` in a Flux script
return the same value.
{{% /note %}}

### Current system time
Import the `system` package and use the [`system.time()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/system/time/)
to return the current system time of the host machine in RFC3339 format.

```js
import "system"

system.time()
```

{{% note %}}
`system.time()` returns the time it is executed, so each instance of `system.time()`
in a Flux script returns a unique value.
{{% /note %}}

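A common use of `now()` is as an explicit query boundary. A minimal sketch, assuming a hypothetical `db/rp` bucket:

```js
from(bucket: "db/rp")
  // Query the last hour of data, ending at the current UTC time
  |> range(start: -1h, stop: now())
```
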
## Normalize irregular timestamps
To normalize irregular timestamps, truncate all `_time` values to a specified unit
with the [`truncateTimeColumn()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/truncatetimecolumn/).
This is useful in [`join()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/join/)
and [`pivot()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/pivot/)
operations where points should align by time, but timestamps vary slightly.

```js
data
  |> truncateTimeColumn(unit: 1m)
```

{{< flex >}}
{{% flex-content %}}
**Input:**

| _time                | _value |
|:-----                | ------:|
| 2020-01-01T00:00:49Z | 2.0    |
| 2020-01-01T00:01:01Z | 1.9    |
| 2020-01-01T00:03:22Z | 1.8    |
| 2020-01-01T00:04:04Z | 1.9    |
| 2020-01-01T00:05:38Z | 2.1    |
{{% /flex-content %}}
{{% flex-content %}}
**Output:**

| _time                | _value |
|:-----                | ------:|
| 2020-01-01T00:00:00Z | 2.0    |
| 2020-01-01T00:01:00Z | 1.9    |
| 2020-01-01T00:03:00Z | 1.8    |
| 2020-01-01T00:04:00Z | 1.9    |
| 2020-01-01T00:05:00Z | 2.1    |
{{% /flex-content %}}
{{< /flex >}}

## Use timestamps and durations together
|
||||
- [Add a duration to a timestamp](#add-a-duration-to-a-timestamp)
|
||||
- [Subtract a duration from a timestamp](#subtract-a-duration-from-a-timestamp)
|
||||
|
||||
### Add a duration to a timestamp
|
||||
The [`experimental.addDuration()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/addduration/)
|
||||
adds a duration to a specified time and returns the resulting time.
|
||||
|
||||
{{% warn %}}
|
||||
By using `experimental.addDuration()`, you accept the
|
||||
[risks of experimental functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/#use-experimental-functions-at-your-own-risk).
|
||||
{{% /warn %}}
|
||||
|
||||
```js
|
||||
import "experimental"
|
||||
|
||||
experimental.addDuration(
|
||||
d: 6h,
|
||||
to: 2019-09-16T12:00:00Z,
|
||||
)
|
||||
|
||||
// Returns 2019-09-16T18:00:00.000000000Z
|
||||
```
|
||||
|
||||
### Subtract a duration from a timestamp
|
||||
The [`experimental.subDuration()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/subduration/)
|
||||
subtracts a duration from a specified time and returns the resulting time.
|
||||
|
||||
{{% warn %}}
|
||||
By using `experimental.subDuration()`, you accept the
|
||||
[risks of experimental functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/#use-experimental-functions-at-your-own-risk).
|
||||
{{% /warn %}}
|
||||
|
||||
```js
|
||||
import "experimental"
|
||||
|
||||
experimental.subDuration(
|
||||
d: 6h,
|
||||
from: 2019-09-16T12:00:00Z,
|
||||
)
|
||||
|
||||
// Returns 2019-09-16T06:00:00.000000000Z
|
||||
```
|
|
@ -0,0 +1,227 @@
|
|||
---
|
||||
title: Transform data with mathematic operations
|
||||
seotitle: Transform data with mathematic operations in Flux
|
||||
list_title: Transform data with math
|
||||
description: >
|
||||
Use the `map()` function to remap column values and apply mathematic operations.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Transform data with math
|
||||
parent: Query with Flux
|
||||
weight: 5
|
||||
list_query_example: map_math
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/mathematic-operations/
|
||||
v2: /influxdb/v2.0/query-data/flux/mathematic-operations/
|
||||
---
|
||||
|
||||
Flux supports mathematic expressions in data transformations.
|
||||
This article describes how to use [Flux arithmetic operators](/{{< latest "influxdb" "v2" >}}/reference/flux/language/operators/#arithmetic-operators)
|
||||
to "map" over data and transform values using mathematic operations.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
##### Basic mathematic operations
|
||||
```js
|
||||
// Examples executed using the Flux REPL
|
||||
> 9 + 9
|
||||
18
|
||||
> 22 - 14
|
||||
8
|
||||
> 6 * 5
|
||||
30
|
||||
> 21 / 7
|
||||
3
|
||||
```
|
||||
|
||||
<p style="font-size:.85rem;font-style:italic;margin-top:-2rem;">See <a href="/influxdb/v2.0/tools/repl/">Flux read-eval-print-loop (REPL)</a>.</p>
|
||||
|
||||
{{% note %}}
|
||||
#### Operands must be the same type
|
||||
Operands in Flux mathematic operations must be the same data type.
|
||||
For example, integers cannot be used in operations with floats.
|
||||
Otherwise, you will get an error similar to:
|
||||
|
||||
```
|
||||
Error: type error: float != int
|
||||
```
|
||||
|
||||
To convert operands to the same type, use [type-conversion functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/)
|
||||
or manually format operands.
|
||||
The operand data type determines the output data type.
|
||||
For example:
|
||||
|
||||
```js
|
||||
100 // Parsed as an integer
|
||||
100.0 // Parsed as a float
|
||||
|
||||
// Example evaluations
|
||||
> 20 / 8
|
||||
2
|
||||
|
||||
> 20.0 / 8.0
|
||||
2.5
|
||||
```
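
If one operand is already an integer, a sketch using the `float()` type-conversion function produces the same result:

```js
> float(v: 20) / 8.0
2.5
```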
|
||||
{{% /note %}}
|
||||
|
||||
## Custom mathematic functions
|
||||
Flux lets you [create custom functions](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions) that use mathematic operations.
|
||||
View the examples below.
|
||||
|
||||
###### Custom multiplication function
|
||||
```js
|
||||
multiply = (x, y) => x * y
|
||||
|
||||
multiply(x: 10, y: 12)
|
||||
// Returns 120
|
||||
```
|
||||
|
||||
###### Custom percentage function
|
||||
```js
|
||||
percent = (sample, total) => (sample / total) * 100.0
|
||||
|
||||
percent(sample: 20.0, total: 80.0)
|
||||
// Returns 25.0
|
||||
```
|
||||
|
||||
### Transform values in a data stream
|
||||
To transform multiple values in an input stream, your function needs to:
|
||||
|
||||
- [Handle piped-forward data](/{{< latest "influxdb" "v2" >}}/query-data/flux/custom-functions/#functions-that-manipulate-piped-forward-data).
|
||||
- Ensure each operand necessary for the calculation exists in each row _(see [Pivot vs join](#pivot-vs-join) below)_.
|
||||
- Use the [`map()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map) to iterate over each row.
|
||||
|
||||
The example `multiplyByX()` function below includes:
|
||||
|
||||
- A `tables` parameter that represents the input data stream (`<-`).
|
||||
- An `x` parameter which is the number by which values in the `_value` column are multiplied.
|
||||
- A `map()` function that iterates over each row in the input stream.
|
||||
It uses the `with` operator to preserve existing columns in each row.
|
||||
It also multiplies the `_value` column by `x`.
|
||||
|
||||
```js
|
||||
multiplyByX = (x, tables=<-) =>
|
||||
tables
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
_value: r._value * x
|
||||
})
|
||||
)
|
||||
|
||||
data
|
||||
|> multiplyByX(x: 10)
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Convert bytes to gigabytes
|
||||
To convert active memory from bytes to gigabytes (GB), divide the `active` field
|
||||
in the `mem` measurement by 1,073,741,824.
|
||||
|
||||
The `map()` function iterates over each row in the piped-forward data and defines
|
||||
a new `_value` by dividing the original `_value` by 1073741824.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -10m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field == "active"
|
||||
)
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
_value: r._value / 1073741824
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
You could turn that same calculation into a function:
|
||||
|
||||
```js
|
||||
bytesToGB = (tables=<-) =>
|
||||
tables
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
_value: r._value / 1073741824
|
||||
})
|
||||
)
|
||||
|
||||
data
|
||||
|> bytesToGB()
|
||||
```
|
||||
|
||||
#### Include partial gigabytes
|
||||
Because the original metric (bytes) is an integer, the output of the operation is an integer and does not include partial GBs.
|
||||
To calculate partial GBs, convert the `_value` column and its values to floats using the
|
||||
[`float()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/type-conversions/float)
|
||||
and format the denominator in the division operation as a float.
|
||||
|
||||
```js
|
||||
bytesToGB = (tables=<-) =>
|
||||
tables
|
||||
|> map(fn: (r) => ({
|
||||
r with
|
||||
_value: float(v: r._value) / 1073741824.0
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
### Calculate a percentage
|
||||
To calculate a percentage, use simple division, then multiply the result by 100.
|
||||
|
||||
```js
|
||||
> 1.0 / 4.0 * 100.0
|
||||
25.0
|
||||
```
|
||||
|
||||
_For an in-depth look at calculating percentages, see [Calculate percentages](/enterprise_influxdb/v1.9/flux/guides/calculate-percentages)._
|
||||
|
||||
## Pivot vs join
|
||||
To query and use values in mathematical operations in Flux, operand values must
|
||||
exist in a single row.
|
||||
Both `pivot()` and `join()` will do this, but there are important differences between the two:
|
||||
|
||||
#### Pivot is more performant
|
||||
`pivot()` reads and operates on a single stream of data.
|
||||
`join()` requires two streams of data and the overhead of reading and combining
|
||||
both streams can be significant, especially for larger data sets.
|
||||
|
||||
#### Use join for multiple data sources
|
||||
Use `join()` when querying data from different buckets or data sources.
|
||||
|
||||
##### Pivot fields into columns for mathematic calculations
|
||||
```js
|
||||
data
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with
|
||||
_value: (r.field1 + r.field2) / r.field3 * 100.0
|
||||
}))
|
||||
```
|
||||
|
||||
##### Join multiple data sources for mathematic calculations
|
||||
```js
|
||||
import "sql"
|
||||
import "influxdata/influxdb/secrets"
|
||||
|
||||
pgUser = secrets.get(key: "POSTGRES_USER")
|
||||
pgPass = secrets.get(key: "POSTGRES_PASSWORD")
|
||||
pgHost = secrets.get(key: "POSTGRES_HOST")
|
||||
|
||||
t1 = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://${pgUser}:${pgPass}@${pgHost}",
|
||||
query:"SELECT id, name, available FROM exampleTable"
|
||||
)
|
||||
|
||||
t2 = from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field"
|
||||
)
|
||||
|
||||
join(tables: {t1: t1, t2: t2}, on: ["id"])
|
||||
|> map(fn: (r) => ({ r with _value: r._value_t2 / r.available_t1 * 100.0 }))
|
||||
```
|
|
@ -0,0 +1,146 @@
|
|||
---
|
||||
title: Find median values
|
||||
seotitle: Find median values in Flux
|
||||
list_title: Median
|
||||
description: >
|
||||
Use the `median()` function to return a value representing the `0.5` quantile (50th percentile) or median of input data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Median
|
||||
list_query_example: median
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/median/
|
||||
v2: /influxdb/v2.0/query-data/flux/median/
|
||||
---
|
||||
|
||||
Use the [`median()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/median/)
|
||||
to return a value representing the `0.5` quantile (50th percentile) or median of input data.
|
||||
|
||||
## Select a method for calculating the median
|
||||
Select one of the following methods to calculate the median:
|
||||
|
||||
- [estimate_tdigest](#estimate-tdigest)
|
||||
- [exact_mean](#exact-mean)
|
||||
- [exact_selector](#exact-selector)
|
||||
|
||||
### estimate_tdigest
|
||||
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
|
||||
to compute an accurate `0.5` quantile estimate on large data sources.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`estimate_tdigest` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_mean
|
||||
An aggregate method that takes the average of the two points closest to the `0.5` quantile value.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_mean` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_selector
|
||||
A selector method that returns the data point for which at least 50% of points are less than.
|
||||
Output tables consist of a single row containing the calculated median.
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_selector` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.9/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
## Find the value that represents the median
|
||||
Use the default method, `"estimate_tdigest"`, to return all rows in a table that
|
||||
contain values in the 50th percentile of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median()
|
||||
```
|
||||
|
||||
## Find the average of values closest to the median
|
||||
Use the `exact_mean` method to return a single row per input table containing the
|
||||
average of the two values closest to the mathematical median of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median(method: "exact_mean")
|
||||
```
|
||||
|
||||
## Find the point with the median value
|
||||
Use the `exact_selector` method to return a single row per input table containing the
|
||||
value that 50% of values in the table are less than.
|
||||
|
||||
```js
|
||||
data
|
||||
|> median(method: "exact_selector")
|
||||
```
|
||||
|
||||
## Use median() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, and then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
To specify the [median calculation method](#select-a-method-for-calculating-the-median) in `aggregateWindow()`, use the
|
||||
[full function syntax](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/#specify-parameters-of-the-aggregate-function):
|
||||
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
fn: (tables=<-, column) => tables |> median(method: "exact_selector")
|
||||
)
|
||||
```
|
|
@ -0,0 +1,151 @@
|
|||
---
|
||||
title: Monitor states
|
||||
seotitle: Monitor states and state changes in your events and metrics with Flux.
|
||||
description: Flux provides several functions to help monitor states and state changes in your data.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Monitor states
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/monitor-states/
|
||||
v2: /influxdb/v2.0/query-data/flux/monitor-states/
|
||||
---
|
||||
|
||||
Flux helps you monitor states in your metrics and events:
|
||||
|
||||
- [Find how long a state persists](#find-how-long-a-state-persists)
|
||||
- [Count the number of consecutive states](#count-the-number-of-consecutive-states)
|
||||
- [Detect state changes](#example-query-to-count-machine-state)
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Find how long a state persists
|
||||
|
||||
1. Use the [`stateDuration()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/stateduration/) function to calculate how long a column value has remained the same value (or state). Include the following information:
|
||||
|
||||
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
|
||||
- **Value:** the value (or state) to search for in the specified column.
|
||||
- **State duration column:** a new column to store the state duration (the length of time that the specified value persists).
|
||||
- **Unit:** the unit of time (`1s` (by default), `1m`, `1h`) used to increment the state duration.
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
|> stateDuration(
|
||||
fn: (r) =>
|
||||
r._column_to_search == "value_to_search_for",
|
||||
column: "state_duration",
|
||||
unit: 1s
|
||||
)
|
||||
```
|
||||
|
||||
2. Use `stateDuration()` to search each point for the specified value:
|
||||
|
||||
- For the first point that evaluates `true`, the state duration is set to `0`. For each consecutive point that evaluates `true`, the state duration increases by the time interval between each consecutive point (in specified units).
|
||||
- If the state is `false`, the state duration is reset to `-1`.
|
||||
|
||||
### Example query with stateDuration()
|
||||
|
||||
The following query searches the `doors` bucket over the past 5 minutes to find how many seconds a door has been `closed`.
|
||||
|
||||
```js
|
||||
from(bucket: "doors")
|
||||
|> range(start: -5m)
|
||||
|> stateDuration(
|
||||
fn: (r) =>
|
||||
r._value == "closed",
|
||||
column: "door_closed",
|
||||
unit: 1s
|
||||
)
|
||||
```
|
||||
|
||||
In this example, `door_closed` is the **State duration** column. If you write data to the `doors` bucket every minute, the state duration increases by `60s` for each consecutive point where `_value` is `closed`. If `_value` is not `closed`, the state duration is reset to `-1`.
|
||||
|
||||
#### Query results
|
||||
|
||||
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
|
||||
|
||||
```bash
|
||||
_time _value door_closed
|
||||
2019-10-26T17:39:16Z closed 0
|
||||
2019-10-26T17:40:16Z closed 60
|
||||
2019-10-26T17:41:16Z closed 120
|
||||
2019-10-26T17:42:16Z open -1
|
||||
2019-10-26T17:43:16Z closed 0
|
||||
2019-10-26T17:44:27Z closed 60
|
||||
```
|
||||
|
||||
## Count the number of consecutive states
|
||||
|
||||
1. Use the `stateCount()` function and include the following information:
|
||||
|
||||
- **Column to search:** any tag key, tag value, field key, field value, or measurement.
|
||||
- **Value:** the value (or state) to search for in the specified column.
|
||||
- **State count column:** a new column to store the state count (the number of consecutive records in which the specified value exists).
|
||||
|
||||
<!-- -->
|
||||
```js
|
||||
|> stateCount(
    fn: (r) => r._column_to_search == "value_to_search_for",
    column: "state_count"
)
|
||||
```
|
||||
|
||||
2. Use `stateCount()` to search each point for the specified value:
|
||||
|
||||
- For the first point that evaluates `true`, the state count is set to `1`. For each consecutive point that evaluates `true`, the state count increases by 1.
|
||||
- If the state is `false`, the state count is reset to `-1`.
|
||||
|
||||
### Example query with stateCount()
|
||||
|
||||
The following query searches the `doors` bucket over the past 5 minutes and
|
||||
calculates how many points have `closed` as their `_value`.
|
||||
|
||||
```js
|
||||
from(bucket: "doors")
|
||||
|> range(start: -5m)
|
||||
  |> stateCount(
    fn: (r) => r._value == "closed",
    column: "door_closed"
  )
|
||||
```
|
||||
|
||||
This example stores the **state count** in the `door_closed` column.
|
||||
If you write data to the `doors` bucket every minute, the state count increases
|
||||
by `1` for each consecutive point where `_value` is `closed`.
|
||||
If `_value` is not `closed`, the state count is reset to `-1`.
|
||||
|
||||
#### Query results
|
||||
|
||||
Results for the example query above may look like this (for simplicity, we've omitted the measurement, tag, and field columns):
|
||||
|
||||
```bash
|
||||
_time _value door_closed
|
||||
2019-10-26T17:39:16Z closed 1
|
||||
2019-10-26T17:40:16Z closed 2
|
||||
2019-10-26T17:41:16Z closed 3
|
||||
2019-10-26T17:42:16Z open -1
|
||||
2019-10-26T17:43:16Z closed 1
|
||||
2019-10-26T17:44:27Z closed 2
|
||||
```
|
||||
|
||||
#### Example query to count machine state
|
||||
|
||||
The following query checks the machine state every minute (idle, assigned, or busy).
|
||||
InfluxDB searches the `servers` bucket over the past hour and counts records with a machine state of `idle`, `assigned` or `busy`.
|
||||
|
||||
```js
|
||||
from(bucket: "servers")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r.machine_state == "idle" or
|
||||
r.machine_state == "assigned" or
|
||||
r.machine_state == "busy"
|
||||
)
|
||||
|> stateCount(fn: (r) => r.machine_state == "busy", column: "_count")
|
||||
|> stateCount(fn: (r) => r.machine_state == "assigned", column: "_count")
|
||||
|> stateCount(fn: (r) => r.machine_state == "idle", column: "_count")
|
||||
```
|
|
@ -0,0 +1,115 @@
|
|||
---
|
||||
title: Calculate the moving average
|
||||
seotitle: Calculate the moving average in Flux
|
||||
list_title: Moving Average
|
||||
description: >
|
||||
Use the `movingAverage()` or `timedMovingAverage()` functions to return the moving average of data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Moving Average
|
||||
list_query_example: moving_average
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/moving-average/
|
||||
v2: /influxdb/v2.0/query-data/flux/moving-average/
|
||||
---
|
||||
|
||||
Use the [`movingAverage()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/movingaverage/)
|
||||
or [`timedMovingAverage()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/timedmovingaverage/)
|
||||
functions to return the moving average of data.
|
||||
|
||||
```js
|
||||
data
|
||||
|> movingAverage(n: 5)
|
||||
|
||||
// OR
|
||||
|
||||
data
|
||||
|> timedMovingAverage(every: 5m, period: 10m)
|
||||
```
|
||||
|
||||
### movingAverage()
|
||||
For each row in a table, `movingAverage()` returns the average of the current value and
|
||||
**previous** values where `n` is the total number of values used to calculate the average.
|
||||
|
||||
If `n = 3`:
|
||||
|
||||
| Row # | Calculation |
|
||||
|:-----:|:----------- |
|
||||
| 1 | _Insufficient number of rows_ |
|
||||
| 2 | _Insufficient number of rows_ |
|
||||
| 3 | (Row1 + Row2 + Row3) / 3 |
|
||||
| 4 | (Row2 + Row3 + Row4) / 3 |
|
||||
| 5 | (Row3 + Row4 + Row5) / 3 |
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.2 |
|
||||
| 2020-01-01T00:03:00Z | 1.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.9 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
| 2020-01-01T00:06:00Z | 2.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following would return:**
|
||||
|
||||
```js
|
||||
|> movingAverage(n: 3)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:03:00Z | 1.33 |
|
||||
| 2020-01-01T00:04:00Z | 1.30 |
|
||||
| 2020-01-01T00:05:00Z | 1.36 |
|
||||
| 2020-01-01T00:06:00Z | 1.43 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### timedMovingAverage()
|
||||
For each row in a table, `timedMovingAverage()` returns the average of the
|
||||
current value and all row values in the **previous** `period` (duration).
|
||||
It returns moving averages at a frequency defined by the `every` parameter.
|
||||
|
||||
Each color in the diagram below represents a period of time used to calculate an
|
||||
average and the time a point representing the average is returned.
|
||||
If `every = 30m` and `period = 1h`:
|
||||
|
||||
{{< svg "/static/svgs/timed-moving-avg.svg" >}}
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.2 |
|
||||
| 2020-01-01T00:03:00Z | 1.8 |
|
||||
| 2020-01-01T00:04:00Z | 0.9 |
|
||||
| 2020-01-01T00:05:00Z | 1.4 |
|
||||
| 2020-01-01T00:06:00Z | 2.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following would return:**
|
||||
|
||||
```js
|
||||
|> timedMovingAverage(
|
||||
every: 2m,
|
||||
period: 4m
|
||||
)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.000 |
|
||||
| 2020-01-01T00:04:00Z | 1.333 |
|
||||
| 2020-01-01T00:06:00Z | 1.325 |
|
||||
| 2020-01-01T00:06:00Z | 1.150 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
|
@ -0,0 +1,163 @@
|
|||
---
|
||||
title: Find percentile and quantile values
|
||||
seotitle: Query percentile and quantile values in Flux
|
||||
list_title: Percentile & quantile
|
||||
description: >
|
||||
Use the `quantile()` function to return all values within the `q` quantile or
|
||||
percentile of input data.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Percentile & quantile
|
||||
list_query_example: quantile
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/percentile-quantile/
|
||||
v2: /influxdb/v2.0/query-data/flux/percentile-quantile/
|
||||
---
|
||||
|
||||
Use the [`quantile()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/quantile/)
|
||||
to return a value representing the `q` quantile or percentile of input data.
|
||||
|
||||
## Percentile versus quantile
|
||||
Percentiles and quantiles are very similar, differing only in the number used to calculate return values.
|
||||
A percentile is calculated using numbers between `0` and `100`.
|
||||
A quantile is calculated using numbers between `0.0` and `1.0`.
|
||||
For example, the **`0.5` quantile** is the same as the **50th percentile**.
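
For example, to ask `quantile()` for the 95th percentile, pass the equivalent quantile value (a minimal sketch using the `data` variable from the examples below):

```js
// 95th percentile == 0.95 quantile
data
  |> quantile(q: 0.95)
```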
|
||||
|
||||
## Select a method for calculating the quantile
|
||||
Select one of the following methods to calculate the quantile:
|
||||
|
||||
- [estimate_tdigest](#estimate-tdigest)
|
||||
- [exact_mean](#exact-mean)
|
||||
- [exact_selector](#exact-selector)
|
||||
|
||||
### estimate_tdigest
|
||||
**(Default)** An aggregate method that uses a [t-digest data structure](https://github.com/tdunning/t-digest)
|
||||
to compute a quantile estimate on large data sources.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`estimate_tdigest` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_mean
|
||||
An aggregate method that takes the average of the two points closest to the quantile value.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_mean` returns:**
|
||||
|
||||
| _value |
|
||||
|:------:|
|
||||
| 1.5 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
### exact_selector
|
||||
A selector method that returns the data point for which at least `q` points are less than.
|
||||
Output tables consist of a single row containing the calculated quantile.
|
||||
|
||||
If calculating the `0.5` quantile or 50th percentile:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input table:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:01:00Z | 1.0 |
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
| 2020-01-01T00:03:00Z | 2.0 |
|
||||
| 2020-01-01T00:04:00Z | 3.0 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`exact_selector` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:02:00Z | 1.0 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
{{% note %}}
|
||||
The examples below use the [example data variable](/enterprise_influxdb/v1.9/flux/guides/#example-data-variable).
|
||||
{{% /note %}}
|
||||
|
||||
## Find the value representing the 99th percentile
|
||||
Use the default method, `"estimate_tdigest"`, to return all rows in a table that
|
||||
contain values in the 99th percentile of data in the table.
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99)
|
||||
```
|
||||
|
||||
## Find the average of values closest to the quantile
|
||||
Use the `exact_mean` method to return a single row per input table containing the
|
||||
average of the two values closest to the mathematical quantile of data in the table.
|
||||
For example, to calculate the `0.99` quantile:
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99, method: "exact_mean")
|
||||
```
|
||||
|
||||
## Find the point with the quantile value
|
||||
Use the `exact_selector` method to return a single row per input table containing the
|
||||
value that `q * 100`% of values in the table are less than.
|
||||
For example, to calculate the `0.99` quantile:
|
||||
|
||||
```js
|
||||
data
|
||||
|> quantile(q: 0.99, method: "exact_selector")
|
||||
```
|
||||
|
||||
## Use quantile() with aggregateWindow()
|
||||
[`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/)
|
||||
segments data into windows of time, aggregates data in each window into a single
|
||||
point, and then removes the time-based segmentation.
|
||||
It is primarily used to downsample data.
|
||||
|
||||
To specify the [quantile calculation method](#select-a-method-for-calculating-the-quantile) in
|
||||
`aggregateWindow()`, use the [full function syntax](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow/#specify-parameters-of-the-aggregate-function):
|
||||
|
||||
```js
|
||||
data
|
||||
|> aggregateWindow(
|
||||
every: 5m,
|
||||
fn: (tables=<-, column) =>
|
||||
tables
|
||||
|> quantile(q: 0.99, method: "exact_selector")
|
||||
)
|
||||
```
|
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
title: Query fields and tags
|
||||
seotitle: Query fields and tags in InfluxDB using Flux
|
||||
description: >
|
||||
Use the `filter()` function to query data based on fields, tags, or any other column value.
|
||||
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
|
||||
clause in InfluxQL and other SQL-like query languages.
|
||||
weight: 1
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/query-fields/
|
||||
v2: /influxdb/v2.0/query-data/flux/query-fields/
|
||||
list_code_example: |
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field" and
|
||||
r.tag == "example-tag"
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
Use the [`filter()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/filter/)
|
||||
to query data based on fields, tags, or any other column value.
|
||||
`filter()` performs operations similar to the `SELECT` statement and the `WHERE`
|
||||
clause in InfluxQL and other SQL-like query languages.
|
||||
|
||||
## The filter() function
|
||||
`filter()` has an `fn` parameter that expects a **predicate function**,
|
||||
an anonymous function comprised of one or more **predicate expressions**.
|
||||
The predicate function evaluates each input row.
|
||||
Rows that evaluate to `true` are **included** in the output data.
|
||||
Rows that evaluate to `false` are **excluded** from the output data.
|
||||
|
||||
```js
|
||||
// ...
|
||||
|> filter(fn: (r) => r._measurement == "example-measurement")
|
||||
```
|
||||
|
||||
The `fn` predicate function requires an `r` argument, which represents each row
|
||||
as `filter()` iterates over input data.
|
||||
Key-value pairs in the row record represent columns and their values.
|
||||
Use **dot notation** or **bracket notation** to reference specific column values in the predicate function.
|
||||
Use [logical operators](/{{< latest "influxdb" "v2" >}}/reference/flux/language/operators/#logical-operators)
|
||||
to chain multiple predicate expressions together.
|
||||
|
||||
```js
|
||||
// Row record
|
||||
r = {foo: "bar", baz: "quz"}
|
||||
|
||||
// Example predicate function
|
||||
(r) => r.foo == "bar" and r["baz"] == "quz"
|
||||
|
||||
// Evaluation results
|
||||
(r) => true and true
|
||||
```
|
||||
|
||||
## Filter by fields and tags
|
||||
The combination of [`from()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/inputs/from),
|
||||
[`range()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/range),
|
||||
and `filter()` represent the most basic Flux query:
|
||||
|
||||
1. Use `from()` to define your [bucket](/enterprise_influxdb/v1.9/flux/get-started/#buckets).
|
||||
2. Use `range()` to limit query results by time.
|
||||
3. Use `filter()` to identify what rows of data to output.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "example-measurement" and
|
||||
r._field == "example-field" and
|
||||
r.tag == "example-tag"
|
||||
)
|
||||
```
|
|
@ -0,0 +1,175 @@
|
|||
---
|
||||
title: Calculate the rate of change
|
||||
seotitle: Calculate the rate of change in Flux
|
||||
list_title: Rate
|
||||
description: >
|
||||
Use the `derivative()` function to calculate the rate of change between subsequent values or the
|
||||
`aggregate.rate()` function to calculate the average rate of change per window of time.
|
||||
If time between points varies, these functions normalize points to a common time interval
|
||||
making values easily comparable.
|
||||
weight: 10
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
name: Rate
|
||||
list_query_example: rate_of_change
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/rate/
|
||||
v2: /influxdb/v2.0/query-data/flux/rate/
|
||||
---
|
||||
|
||||
|
||||
Use the [`derivative()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/derivative/)
|
||||
to calculate the rate of change between subsequent values or the
|
||||
[`aggregate.rate()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/aggregate/rate/)
|
||||
to calculate the average rate of change per window of time.
|
||||
If time between points varies, these functions normalize points to a common time interval
|
||||
making values easily comparable.
|
||||
|
||||
- [Rate of change between subsequent values](#rate-of-change-between-subsequent-values)
|
||||
- [Average rate of change per window of time](#average-rate-of-change-per-window-of-time)
|
||||
|
||||
## Rate of change between subsequent values
|
||||
Use the [`derivative()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/derivative/)
|
||||
to calculate the rate of change per unit of time between subsequent _non-null_ values.
|
||||
|
||||
```js
|
||||
data
|
||||
|> derivative(unit: 1s)
|
||||
```
|
||||
|
||||
By default, `derivative()` returns only positive derivative values and replaces negative values with _null_.
|
||||
Calculated values are returned as [floats](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#numeric-types).
|
||||
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**`derivative(unit: 1m)` returns:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | |
|
||||
| 2020-01-01T00:12:00Z | |
|
||||
| 2020-01-01T00:19:00Z | 10.0 |
|
||||
| 2020-01-01T00:32:00Z | |
|
||||
| 2020-01-01T00:51:00Z | 4.74 |
|
||||
| 2020-01-01T01:00:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the rate of change **per minute** between subsequent values with
|
||||
negative values set to _null_.
|
||||
|
||||
### Return negative derivative values
|
||||
To return negative derivative values, set the `nonNegative` parameter to `false`:
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following returns:**
|
||||
|
||||
```js
|
||||
|> derivative(
|
||||
unit: 1m,
|
||||
nonNegative: false
|
||||
)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:04:00Z | -22.5 |
|
||||
| 2020-01-01T00:12:00Z | -1.25 |
|
||||
| 2020-01-01T00:19:00Z | 10.0 |
|
||||
| 2020-01-01T00:32:00Z | -1.54 |
|
||||
| 2020-01-01T00:51:00Z | 4.74 |
|
||||
| 2020-01-01T01:00:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the rate of change **per minute** between subsequent values and
|
||||
include negative values.
|
||||
|
||||
## Average rate of change per window of time
|
||||
|
||||
Use the [`aggregate.rate()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/experimental/aggregate/rate/)
|
||||
to calculate the average rate of change per window of time.
|
||||
|
||||
```js
|
||||
import "experimental/aggregate"
|
||||
|
||||
data
|
||||
|> aggregate.rate(
|
||||
every: 1m,
|
||||
unit: 1s,
|
||||
groupColumns: ["tag1", "tag2"]
|
||||
)
|
||||
```
|
||||
|
||||
`aggregate.rate()` returns the average rate of change (as a [float](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#numeric-types))
|
||||
per `unit` for time intervals defined by `every`.
|
||||
Negative values are replaced with _null_.
|
||||
|
||||
{{% note %}}
|
||||
`aggregate.rate()` does not support `nonNegative: false`.
|
||||
{{% /note %}}
|
||||
|
||||
{{< flex >}}
|
||||
{{% flex-content %}}
|
||||
**Given the following input:**
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:00:00Z | 250 |
|
||||
| 2020-01-01T00:04:00Z | 160 |
|
||||
| 2020-01-01T00:12:00Z | 150 |
|
||||
| 2020-01-01T00:19:00Z | 220 |
|
||||
| 2020-01-01T00:32:00Z | 200 |
|
||||
| 2020-01-01T00:51:00Z | 290 |
|
||||
| 2020-01-01T01:00:00Z | 340 |
|
||||
{{% /flex-content %}}
|
||||
{{% flex-content %}}
|
||||
**The following returns:**
|
||||
|
||||
```js
|
||||
|> aggregate.rate(
|
||||
every: 20m,
|
||||
unit: 1m
|
||||
)
|
||||
```
|
||||
|
||||
| _time | _value |
|
||||
|:----- | ------:|
|
||||
| 2020-01-01T00:20:00Z | |
|
||||
| 2020-01-01T00:40:00Z | 10.0 |
|
||||
| 2020-01-01T01:00:00Z | 4.74 |
|
||||
| 2020-01-01T01:20:00Z | 5.56 |
|
||||
{{% /flex-content %}}
|
||||
{{< /flex >}}
|
||||
|
||||
Results represent the **average change rate per minute** of every **20 minute interval**
|
||||
with negative values set to _null_.
|
||||
Timestamps represent the right bound of the time window used to average values.
|
|
@ -0,0 +1,93 @@
|
|||
---
|
||||
title: Use regular expressions in Flux
|
||||
list_title: Regular expressions
|
||||
description: This guide walks through using regular expressions in evaluation logic in Flux functions.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Regular expressions
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
list_query_example: regular_expressions
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/regular-expressions/
|
||||
v2: /influxdb/v2.0/query-data/flux/regular-expressions/
|
||||
---
|
||||
|
||||
Regular expressions (regexes) are incredibly powerful when matching patterns in large collections of data.
|
||||
With Flux, regular expressions are primarily used for evaluation logic in predicate functions for things
|
||||
such as filtering rows, dropping and keeping columns, state detection, etc.
|
||||
This guide shows how to use regular expressions in your Flux scripts.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
## Go regular expression syntax
|
||||
Flux uses Go's [regexp package](https://golang.org/pkg/regexp/) for regular expression search.
|
||||
The links [below](#helpful-links) provide information about Go's regular expression syntax.
|
||||
|
||||
## Regular expression operators
|
||||
Flux provides two comparison operators for use with regular expressions.
|
||||
|
||||
#### `=~`
|
||||
When the expression on the left **MATCHES** the regular expression on the right, this evaluates to `true`.
|
||||
|
||||
#### `!~`
|
||||
When the expression on the left **DOES NOT MATCH** the regular expression on the right, this evaluates to `true`.
|
||||
|
||||
## Regular expressions in Flux
|
||||
When using regex matching in your Flux scripts, enclose your regular expressions with `/`.
|
||||
The following is the basic regex comparison syntax:
|
||||
|
||||
###### Basic regex comparison syntax
|
||||
```js
|
||||
expression =~ /regex/
|
||||
expression !~ /regex/
|
||||
```
|
||||
## Examples
|
||||
|
||||
### Use a regex to filter by tag value
|
||||
The following example filters records by the `cpu` tag.
|
||||
It only keeps records for which the `cpu` is either `cpu0`, `cpu1`, or `cpu2`.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "cpu" and
|
||||
r._field == "usage_user" and
|
||||
r.cpu =~ /cpu[0-2]/
|
||||
)
|
||||
```
|
||||
|
||||
### Use a regex to filter by field key
|
||||
The following example excludes records that do not have `_percent` in a field key.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field =~ /_percent/
|
||||
)
|
||||
```
|
||||
|
||||
### Drop columns matching a regex
|
||||
The following example drops columns whose names do not begin with `_`.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "mem")
|
||||
|> drop(fn: (column) => column !~ /_.*/)
|
||||
```
|
||||
|
||||
## Helpful links
|
||||
|
||||
##### Syntax documentation
|
||||
[regexp Syntax GoDoc](https://godoc.org/regexp/syntax)
|
||||
[RE2 Syntax Overview](https://github.com/google/re2/wiki/Syntax)
|
||||
|
||||
##### Go regex testers
|
||||
[Regex Tester - Golang](https://regex-golang.appspot.com/assets/html/index.html)
|
||||
[Regex101](https://regex101.com/)
|
|
@ -0,0 +1,262 @@
|
|||
---
|
||||
title: Extract scalar values in Flux
|
||||
list_title: Extract scalar values
|
||||
description: >
|
||||
Use Flux stream and table functions to extract scalar values from Flux query output.
|
||||
This lets you, for example, dynamically set variables using query results.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Extract scalar values
|
||||
parent: Query with Flux
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/scalar-values/
|
||||
v2: /influxdb/v2.0/query-data/flux/scalar-values/
|
||||
list_code_example: |
|
||||
```js
|
||||
scalarValue = {
|
||||
_record =
|
||||
data
|
||||
|> tableFind(fn: key => true)
|
||||
|> getRecord(idx: 0)
|
||||
return _record._value
|
||||
}
|
||||
```
|
||||
---
|
||||
|
||||
Use Flux [stream and table functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/stream-table/)
|
||||
to extract scalar values from Flux query output.
|
||||
This lets you, for example, dynamically set variables using query results.
|
||||
|
||||
**To extract scalar values from output:**
|
||||
|
||||
1. [Extract a table](#extract-a-table).
|
||||
2. [Extract a column from the table](#extract-a-column-from-the-table)
|
||||
_**or**_ [extract a row from the table](#extract-a-row-from-the-table).
|
||||
|
||||
_The samples on this page use the [sample data provided below](#sample-data)._
|
||||
|
||||
{{% warn %}}
|
||||
#### Current limitations
|
||||
- The InfluxDB user interface (UI) does not currently support raw scalar output.
|
||||
Use [`map()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/) to add
|
||||
scalar values to output data.
|
||||
- The [Flux REPL](/enterprise_influxdb/v1.9/flux/guides/execute-queries/#influx-cli) does not currently support
|
||||
Flux stream and table functions (also known as "dynamic queries").
|
||||
See [#15231](https://github.com/influxdata/influxdb/issues/15231).
|
||||
{{% /warn %}}
|
||||
|
||||
## Extract a table
|
||||
Flux formats query results as a stream of tables.
|
||||
To extract a scalar value from a stream of tables, you must first extract a single table.
|
||||
|
||||
|
||||
|
||||
{{% note %}}
|
||||
If query results include only one table, it is still formatted as a stream of tables.
|
||||
You still must extract that table from the stream.
|
||||
{{% /note %}}
|
||||
|
||||
Use [`tableFind()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/stream-table/tablefind/)
|
||||
to extract the **first** table whose [group key](/enterprise_influxdb/v1.9/flux/get-started/#group-keys)
|
||||
values match the `fn` **predicate function**.
|
||||
The predicate function requires a `key` record, which represents the group key of
|
||||
each table.
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) =>
|
||||
key._field == "temp" and
|
||||
key.location == "sfo"
|
||||
)
|
||||
```
|
||||
|
||||
The example above returns a single table:
|
||||
|
||||
| _time | location | _field | _value |
|
||||
|:----- |:--------:|:------:| ------:|
|
||||
| 2019-11-01T12:00:00Z | sfo | temp | 65.1 |
|
||||
| 2019-11-01T13:00:00Z | sfo | temp | 66.2 |
|
||||
| 2019-11-01T14:00:00Z | sfo | temp | 66.3 |
|
||||
| 2019-11-01T15:00:00Z | sfo | temp | 66.8 |
|
||||
|
||||
{{% note %}}
|
||||
#### Extract the correct table
|
||||
Flux functions do not guarantee table order and `tableFind()` returns only the
|
||||
**first** table that matches the `fn` predicate.
|
||||
To extract the table that includes the data you actually want, be very specific in
|
||||
your predicate function or filter and transform your data to minimize the number
|
||||
of tables piped-forward into `tableFind()`.
|
||||
{{% /note %}}
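
For example, a minimal sketch (using the [sample data](#sample-data) below) that narrows the stream before extracting a table:

```js
sampleData
  |> filter(fn: (r) => r.location == "sfo")
  |> tableFind(fn: (key) => key._field == "temp")
```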
|
||||
|
||||
## Extract a column from the table
|
||||
Use the [`getColumn()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/stream-table/getcolumn/)
|
||||
to output an array of values from a specific column in the extracted table.
|
||||
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) =>
|
||||
key._field == "temp" and
|
||||
key.location == "sfo"
|
||||
)
|
||||
|> getColumn(column: "_value")
|
||||
|
||||
// Returns [65.1, 66.2, 66.3, 66.8]
|
||||
```
|
||||
|
||||
### Use extracted column values
|
||||
Use a variable to store the array of values.
|
||||
In the example below, `SFOTemps` represents the array of values.
|
||||
Reference a specific index (integer starting from `0`) in the array to return the
|
||||
value at that index.
|
||||
|
||||
```js
|
||||
SFOTemps = sampleData
|
||||
|> tableFind(fn: (key) =>
|
||||
key._field == "temp" and
|
||||
key.location == "sfo"
|
||||
)
|
||||
|> getColumn(column: "_value")
|
||||
|
||||
SFOTemps
|
||||
// Returns [65.1, 66.2, 66.3, 66.8]
|
||||
|
||||
SFOTemps[0]
|
||||
// Returns 65.1
|
||||
|
||||
SFOTemps[2]
|
||||
// Returns 66.3
|
||||
```
|
||||
|
||||
## Extract a row from the table
|
||||
Use the [`getRecord()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/stream-table/getrecord/)
|
||||
to output data from a single row in the extracted table.
|
||||
Specify the index of the row to output using the `idx` parameter.
|
||||
The function outputs a record with key-value pairs for each column.
|
||||
|
||||
```js
|
||||
sampleData
|
||||
|> tableFind(fn: (key) =>
|
||||
key._field == "temp" and
|
||||
key.location == "sfo"
|
||||
)
|
||||
|> getRecord(idx: 0)
|
||||
|
||||
// Returns {
|
||||
// _time:2019-11-11T12:00:00Z,
|
||||
// _field:"temp",
|
||||
// location:"sfo",
|
||||
// _value: 65.1
|
||||
// }
|
||||
```
|
||||
|
||||
### Use an extracted row record
|
||||
Use a variable to store the extracted row record.
|
||||
In the example below, `tempInfo` represents the extracted row.
|
||||
Use [dot notation](/enterprise_influxdb/v1.9/flux/get-started/syntax-basics/#records) to reference
|
||||
keys in the record.
|
||||
|
||||
```js
|
||||
tempInfo = sampleData
|
||||
|> tableFind(fn: (key) =>
|
||||
key._field == "temp" and
|
||||
key.location == "sfo"
|
||||
)
|
||||
|> getRecord(idx: 0)
|
||||
|
||||
tempInfo
|
||||
// Returns {
|
||||
// _time:2019-11-11T12:00:00Z,
|
||||
// _field:"temp",
|
||||
// location:"sfo",
|
||||
// _value: 65.1
|
||||
// }
|
||||
|
||||
tempInfo._time
|
||||
// Returns 2019-11-11T12:00:00Z
|
||||
|
||||
tempInfo.location
|
||||
// Returns sfo
|
||||
```
|
||||
|
||||
## Example helper functions
|
||||
Create custom helper functions to extract scalar values from query output.
|
||||
|
||||
##### Extract a scalar field value
|
||||
```js
|
||||
// Define a helper function to extract field values
|
||||
getFieldValue = (tables=<-, field) => {
|
||||
extract = tables
|
||||
|> tableFind(fn: (key) => key._field == field)
|
||||
|> getColumn(column: "_value")
|
||||
return extract[0]
|
||||
}
|
||||
|
||||
// Use the helper function to define a variable
|
||||
lastJFKTemp = sampleData
|
||||
|> filter(fn: (r) => r.location == "kjfk")
|
||||
|> last()
|
||||
|> getFieldValue(field: "temp")
|
||||
|
||||
lastJFKTemp
|
||||
// Returns 71.2
|
||||
```
|
||||
|
||||
##### Extract scalar row data
|
||||
```js
|
||||
// Define a helper function to extract a row as a record
|
||||
getRow = (tables=<-, idx=0) => {
|
||||
extract = tables
|
||||
|> tableFind(fn: (key) => true)
|
||||
|> getRecord(idx: idx)
|
||||
return extract
|
||||
}
|
||||
|
||||
// Use the helper function to define a variable
|
||||
lastReported = sampleData
|
||||
|> last()
|
||||
|> getRow(idx: 0)
|
||||
|
||||
"The last location to report was ${lastReported.location}.
|
||||
The temperature was ${string(v: lastReported._value)}°F."
|
||||
|
||||
// Returns:
|
||||
// The last location to report was kord.
|
||||
// The temperature was 38.9°F.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Sample data
|
||||
|
||||
The following sample data set represents fictional temperature metrics collected
|
||||
from three locations.
|
||||
It's formatted in [annotated CSV](https://v2.docs.influxdata.com/v2.0/reference/syntax/annotated-csv/) and imported
|
||||
into the Flux query using the [`csv.from()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/csv/from/).
|
||||
|
||||
Place the following at the beginning of your query to use the sample data:
|
||||
|
||||
{{% truncate %}}
|
||||
```js
|
||||
import "csv"
|
||||
|
||||
sampleData = csv.from(csv: "
|
||||
#datatype,string,long,dateTime:RFC3339,string,string,double
|
||||
#group,false,true,false,true,true,false
|
||||
#default,,,,,,
|
||||
,result,table,_time,location,_field,_value
|
||||
,,0,2019-11-01T12:00:00Z,sfo,temp,65.1
|
||||
,,0,2019-11-01T13:00:00Z,sfo,temp,66.2
|
||||
,,0,2019-11-01T14:00:00Z,sfo,temp,66.3
|
||||
,,0,2019-11-01T15:00:00Z,sfo,temp,66.8
|
||||
,,1,2019-11-01T12:00:00Z,kjfk,temp,69.4
|
||||
,,1,2019-11-01T13:00:00Z,kjfk,temp,69.9
|
||||
,,1,2019-11-01T14:00:00Z,kjfk,temp,71.0
|
||||
,,1,2019-11-01T15:00:00Z,kjfk,temp,71.2
|
||||
,,2,2019-11-01T12:00:00Z,kord,temp,46.4
|
||||
,,2,2019-11-01T13:00:00Z,kord,temp,46.3
|
||||
,,2,2019-11-01T14:00:00Z,kord,temp,42.7
|
||||
,,2,2019-11-01T15:00:00Z,kord,temp,38.9
|
||||
")
|
||||
```
|
||||
{{% /truncate %}}
|
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
title: Sort and limit data with Flux
|
||||
seotitle: Sort and limit data in InfluxDB with Flux
|
||||
list_title: Sort and limit
|
||||
description: >
|
||||
Use the `sort()` function to order records within each table by specific columns and the
|
||||
`limit()` function to limit the number of records in output tables to a fixed number, `n`.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Sort and limit
|
||||
parent: Query with Flux
|
||||
weight: 3
|
||||
list_query_example: sort_limit
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sort-limit/
|
||||
v2: /influxdb/v2.0/query-data/flux/sort-limit/
|
||||
---
|
||||
|
||||
Use the [`sort()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/sort)
|
||||
to order records within each table by specific columns and the
|
||||
[`limit()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/limit)
|
||||
to limit the number of records in output tables to a fixed number, `n`.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
##### Example sorting system uptime
|
||||
|
||||
The following example orders system uptime first by region, then host, then value.
|
||||
|
||||
```js
|
||||
from(bucket:"db/rp")
|
||||
|> range(start:-12h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "system" and
|
||||
r._field == "uptime"
|
||||
)
|
||||
|> sort(columns:["region", "host", "_value"])
|
||||
```
|
||||
|
||||
The [`limit()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/limit)
|
||||
limits the number of records in output tables to a fixed number, `n`.
|
||||
The following example shows up to 10 records from the past hour.
|
||||
|
||||
```js
|
||||
from(bucket:"db/rp")
|
||||
|> range(start:-1h)
|
||||
|> limit(n:10)
|
||||
```
|
||||
|
||||
You can use `sort()` and `limit()` together to show the top N records.
|
||||
The example below returns the 10 top system uptime values sorted first by
|
||||
region, then host, then value.
|
||||
|
||||
```js
|
||||
from(bucket:"db/rp")
|
||||
|> range(start:-12h)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "system" and
|
||||
r._field == "uptime"
|
||||
)
|
||||
|> sort(columns:["region", "host", "_value"])
|
||||
|> limit(n:10)
|
||||
```
|
||||
|
||||
You have now created a Flux query that sorts and limits data.
|
||||
Flux also provides the [`top()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/top)
|
||||
and [`bottom()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/selectors/bottom)
|
||||
functions to perform both of these operations at the same time.
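For example, the following sketch (reusing the hypothetical `db/rp` bucket and `system` measurement from the examples above) uses `top()` to return the 10 highest uptime values in each table:

```js
from(bucket: "db/rp")
    |> range(start: -12h)
    |> filter(fn: (r) =>
        r._measurement == "system" and
        r._field == "uptime"
    )
    |> top(n: 10)
```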
|
|
@ -0,0 +1,215 @@
|
|||
---
|
||||
title: Query SQL data sources
|
||||
seotitle: Query SQL data sources with InfluxDB
|
||||
list_title: Query SQL data
|
||||
description: >
|
||||
The Flux `sql` package provides functions for working with SQL data sources.
|
||||
Use `sql.from()` to query SQL databases like PostgreSQL and MySQL.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Query with Flux
|
||||
list_title: SQL data
|
||||
weight: 20
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/sql/
|
||||
v2: /influxdb/v2.0/query-data/flux/sql/
|
||||
list_code_example: |
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query: "SELECT * FROM example_table"
|
||||
)
|
||||
```
|
||||
---
|
||||
|
||||
The Flux `sql` package provides functions for working with SQL data sources.
|
||||
[`sql.from()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/sql/from/) lets you query SQL data sources
|
||||
like [PostgreSQL](https://www.postgresql.org/), [MySQL](https://www.mysql.com/),
|
||||
and [SQLite](https://www.sqlite.org/index.html), and use the results with InfluxDB
|
||||
dashboards, tasks, and other operations.
|
||||
|
||||
- [Query a SQL data source](#query-a-sql-data-source)
|
||||
- [Join SQL data with data in InfluxDB](#join-sql-data-with-data-in-influxdb)
|
||||
- [Sample sensor data](#sample-sensor-data)
|
||||
|
||||
## Query a SQL data source
|
||||
To query a SQL data source:
|
||||
|
||||
1. Import the `sql` package in your Flux query
|
||||
2. Use the `sql.from()` function to specify the driver, data source name (DSN),
|
||||
and query used to query data from your SQL data source:
|
||||
|
||||
{{< code-tabs-wrapper >}}
|
||||
{{% code-tabs %}}
|
||||
[PostgreSQL](#)
|
||||
[MySQL](#)
|
||||
[SQLite](#)
|
||||
{{% /code-tabs %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://user:password@localhost",
|
||||
query: "SELECT * FROM example_table"
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
import "sql"
|
||||
|
||||
sql.from(
|
||||
driverName: "mysql",
|
||||
dataSourceName: "user:password@tcp(localhost:3306)/db",
|
||||
query: "SELECT * FROM example_table"
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
|
||||
{{% code-tab-content %}}
|
||||
```js
|
||||
// NOTE: InfluxDB OSS and InfluxDB Cloud do not have access to
|
||||
// the local filesystem and cannot query SQLite data sources.
|
||||
// Use the Flux REPL to query an SQLite data source.
|
||||
|
||||
import "sql"
|
||||
sql.from(
|
||||
driverName: "sqlite3",
|
||||
dataSourceName: "file:/path/to/test.db?cache=shared&mode=ro",
|
||||
query: "SELECT * FROM example_table"
|
||||
)
|
||||
```
|
||||
{{% /code-tab-content %}}
|
||||
{{< /code-tabs-wrapper >}}
|
||||
|
||||
_See the [`sql.from()` documentation](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/sql/from/) for
|
||||
information about required function parameters._
|
||||
|
||||
## Join SQL data with data in InfluxDB
|
||||
One of the primary benefits of querying SQL data sources from InfluxDB
|
||||
is the ability to enrich query results with data stored outside of InfluxDB.
|
||||
|
||||
Using the [air sensor sample data](#sample-sensor-data) below, the following query
|
||||
joins air sensor metrics stored in InfluxDB with sensor information stored in PostgreSQL.
|
||||
The joined data lets you query and filter results based on sensor information
|
||||
that isn't stored in InfluxDB.
|
||||
|
||||
```js
|
||||
// Import the "sql" package
|
||||
import "sql"
|
||||
|
||||
// Query data from PostgreSQL
|
||||
sensorInfo = sql.from(
|
||||
driverName: "postgres",
|
||||
dataSourceName: "postgresql://localhost?sslmode=disable",
|
||||
query: "SELECT * FROM sensors"
|
||||
)
|
||||
|
||||
// Query data from InfluxDB
|
||||
sensorMetrics = from(bucket: "telegraf/autogen")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r._measurement == "airSensors")
|
||||
|
||||
// Join InfluxDB query results with PostgreSQL query results
|
||||
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Sample sensor data
|
||||
The [sample data generator](#download-and-run-the-sample-data-generator) and
|
||||
[sample sensor information](#import-the-sample-sensor-information) simulate a
|
||||
group of sensors that measure temperature, humidity, and carbon monoxide
|
||||
in rooms throughout a building.
|
||||
Each collected data point is stored in InfluxDB with a `sensor_id` tag that identifies
|
||||
the specific sensor it came from.
|
||||
Sample sensor information is stored in PostgreSQL.
|
||||
|
||||
**Sample data includes:**
|
||||
|
||||
- Simulated data collected from each sensor and stored in the `airSensors` measurement in **InfluxDB**:
|
||||
- temperature
|
||||
- humidity
|
||||
- co
|
||||
|
||||
- Information about each sensor stored in the `sensors` table in **PostgreSQL**:
|
||||
- sensor_id
|
||||
- location
|
||||
- model_number
|
||||
- last_inspected
|
||||
|
||||
### Import and generate sample sensor data
|
||||
|
||||
#### Download and run the sample data generator
|
||||
`air-sensor-data.rb` is a script that generates air sensor data and stores the data in InfluxDB.
|
||||
To use `air-sensor-data.rb`:
|
||||
|
||||
1. [Create a database](/enterprise_influxdb/v1.9/introduction/get-started/#creating-a-database) to store the data.
|
||||
2. Download the sample data generator. _This tool requires [Ruby](https://www.ruby-lang.org/en/)._
|
||||
|
||||
<a class="btn download" style="color:#fff" href="/downloads/air-sensor-data.rb" download>Download Air Sensor Generator</a>
|
||||
|
||||
3. Give `air-sensor-data.rb` executable permissions:
|
||||
|
||||
```
|
||||
chmod +x air-sensor-data.rb
|
||||
```
|
||||
|
||||
4. Start the generator. Specify your database.
|
||||
|
||||
```
|
||||
./air-sensor-data.rb -d database-name
|
||||
```
|
||||
|
||||
The generator begins to write data to InfluxDB and will continue until stopped.
|
||||
Use `ctrl-c` to stop the generator.
|
||||
|
||||
_**Note:** Use the `--help` flag to view other configuration options._
|
||||
|
||||
|
||||
5. Query your target database to ensure the generated data is writing successfully.
|
||||
The generator doesn't catch errors from write requests, so it will continue running
|
||||
even if data is not writing to InfluxDB successfully.
|
||||
|
||||
```
|
||||
from(bucket: "database-name/autogen")
|
||||
|> range(start: -1m)
|
||||
|> filter(fn: (r) => r._measurement == "airSensors")
|
||||
```
|
||||
|
||||
#### Import the sample sensor information
|
||||
1. [Download and install PostgreSQL](https://www.postgresql.org/download/).
|
||||
2. Download the sample sensor information CSV.
|
||||
|
||||
<a class="btn download" style="color:#fff" href="/downloads/sample-sensor-info.csv" download>Download Sample Data</a>
|
||||
|
||||
3. Use a PostgreSQL client (`psql` or a GUI) to create the `sensors` table:
|
||||
|
||||
```
|
||||
CREATE TABLE sensors (
|
||||
sensor_id character varying(50),
|
||||
location character varying(50),
|
||||
model_number character varying(50),
|
||||
last_inspected date
|
||||
);
|
||||
```
|
||||
|
||||
4. Import the downloaded CSV sample data.
|
||||
_Update the `FROM` file path to the path of the downloaded CSV sample data._
|
||||
|
||||
```
|
||||
COPY sensors(sensor_id,location,model_number,last_inspected)
|
||||
FROM '/path/to/sample-sensor-info.csv' DELIMITER ',' CSV HEADER;
|
||||
```
|
||||
|
||||
5. Query the table to ensure the data was imported correctly:
|
||||
|
||||
```
|
||||
SELECT * FROM sensors;
|
||||
```
|
|
@ -0,0 +1,354 @@
|
|||
---
|
||||
title: Window and aggregate data with Flux
|
||||
seotitle: Window and aggregate data in InfluxDB with Flux
|
||||
list_title: Window & aggregate
|
||||
description: >
|
||||
This guide walks through windowing and aggregating data with Flux and outlines
|
||||
how it shapes your data in the process.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Window & aggregate
|
||||
parent: Query with Flux
|
||||
weight: 4
|
||||
list_query_example: aggregate_window
|
||||
canonical: /{{< latest "influxdb" "v2" >}}/query-data/flux/window-aggregate/
|
||||
v2: /influxdb/v2.0/query-data/flux/window-aggregate/
|
||||
---
|
||||
|
||||
A common operation performed with time series data is grouping data into windows of time,
|
||||
or "windowing" data, then aggregating windowed values into a new value.
|
||||
This guide walks through windowing and aggregating data with Flux and demonstrates
|
||||
how data is shaped in the process.
|
||||
|
||||
If you're just getting started with Flux queries, check out the following:
|
||||
|
||||
- [Get started with Flux](/enterprise_influxdb/v1.9/flux/get-started/) for a conceptual overview of Flux and parts of a Flux query.
|
||||
- [Execute queries](/enterprise_influxdb/v1.9/flux/guides/execute-queries/) to discover a variety of ways to run your queries.
|
||||
|
||||
{{% note %}}
|
||||
The following example is an in-depth walk-through of the steps required to window and aggregate data.
|
||||
The [`aggregateWindow()` function](#summing-up) performs these operations for you, but understanding
|
||||
how data is shaped in the process helps you create your desired output.
|
||||
{{% /note %}}
|
||||
|
||||
## Data set
|
||||
For the purposes of this guide, define a variable that represents your base data set.
|
||||
The following example queries the memory usage of the host machine.
|
||||
|
||||
```js
|
||||
dataSet = from(bucket: "db/rp")
|
||||
|> range(start: -5m)
|
||||
|> filter(fn: (r) =>
|
||||
r._measurement == "mem" and
|
||||
r._field == "used_percent"
|
||||
)
|
||||
|> drop(columns: ["host"])
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
This example drops the `host` column from the returned data since the memory data
|
||||
is only tracked for a single host and it simplifies the output tables.
|
||||
Dropping the `host` column is optional and not recommended if monitoring memory
|
||||
on multiple hosts.
|
||||
{{% /note %}}
|
||||
|
||||
`dataSet` can now be used to represent your base data, which will look similar to the following:
|
||||
|
||||
{{% truncate %}}
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## Windowing data
|
||||
Use the [`window()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/window)
|
||||
to group your data based on time bounds.
|
||||
The most common parameter passed with `window()` is `every`, which
|
||||
defines the duration of time between windows.
|
||||
Other parameters are available, but for this example, window the base data
|
||||
set into one-minute windows.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
The `every` parameter supports all [valid duration units](/{{< latest "influxdb" "v2" >}}/reference/flux/language/types/#duration-types),
|
||||
including **calendar months (`1mo`)** and **years (`1y`)**.
|
||||
{{% /note %}}
|
||||
|
||||
Each window of time is output in its own table containing all records that fall within the window.
|
||||
|
||||
{{% truncate %}}
|
||||
###### window() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:00.000000000Z 71.11611366271973
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:10.000000000Z 67.39630699157715
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:20.000000000Z 64.16666507720947
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:30.000000000Z 64.19951915740967
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:40.000000000Z 64.2122745513916
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:50:50.000000000Z 64.22209739685059
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 64.6336555480957
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:10.000000000Z 64.16516304016113
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:20.000000000Z 64.18349742889404
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:30.000000000Z 64.20474052429199
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:40.000000000Z 68.65062713623047
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:51:50.000000000Z 67.20139980316162
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 70.9143877029419
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:10.000000000Z 64.14549350738525
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:20.000000000Z 64.15379047393799
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:30.000000000Z 64.1592264175415
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:40.000000000Z 64.18190002441406
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:52:50.000000000Z 64.28837776184082
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 64.29731845855713
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:10.000000000Z 64.36963081359863
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:20.000000000Z 64.37397003173828
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:30.000000000Z 64.44413661956787
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:40.000000000Z 64.42906856536865
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:53:50.000000000Z 64.44573402404785
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.48912620544434
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:10.000000000Z 64.49522972106934
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:20.000000000Z 64.48652744293213
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:30.000000000Z 64.49949741363525
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:40.000000000Z 64.4949197769165
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:50.000000000Z 64.49787616729736
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
When visualized in the InfluxDB UI, each window table is displayed in a different color.
|
||||
|
||||

|
||||
|
||||
## Aggregate data
|
||||
[Aggregate functions](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates) take the values
|
||||
of all rows in a table and use them to perform an aggregate operation.
|
||||
The result is output as a new value in a single-row table.
|
||||
|
||||
Since windowed data is split into separate tables, aggregate operations run against
|
||||
each table separately and output new tables containing only the aggregated value.
|
||||
|
||||
For this example, use the [`mean()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/mean)
|
||||
to output the average of each window:
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### mean() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 65.88549613952637
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 65.50651391347249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 65.30719598134358
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 64.39330975214641
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49386278788249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
Because each data point is contained in its own table, when visualized,
|
||||
they appear as single, unconnected points.
|
||||
|
||||

|
||||
|
||||
### Recreate the time column
|
||||
**Notice the `_time` column is not in the [aggregated output tables](#mean-output-tables).**
|
||||
Because records in each table are aggregated together, their timestamps no longer
|
||||
apply and the column is removed from the group key and table.
|
||||
|
||||
Also notice the `_start` and `_stop` columns still exist.
|
||||
These represent the lower and upper bounds of the time window.
|
||||
|
||||
Many Flux functions rely on the `_time` column.
|
||||
To further process your data after an aggregate function, you need to re-add `_time`.
|
||||
Use the [`duplicate()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/duplicate) to
|
||||
duplicate either the `_start` or `_stop` column as a new `_time` column.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
```
|
||||
|
||||
{{% truncate %}}
|
||||
###### duplicate() output tables
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:51:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:51:00.000000000Z 2018-11-03T17:52:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:52:00.000000000Z 2018-11-03T17:53:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:53:00.000000000Z 2018-11-03T17:54:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:54:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
|
||||
|
||||
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:55:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
{{% /truncate %}}
|
||||
|
||||
## "Unwindow" aggregate tables
|
||||
Keeping aggregate values in separate tables generally isn't how you want your data formatted.
|
||||
Use the `window()` function to "unwindow" your data into a single infinite (`inf`) window.
|
||||
|
||||
```js
|
||||
dataSet
|
||||
|> window(every: 1m)
|
||||
|> mean()
|
||||
|> duplicate(column: "_stop", as: "_time")
|
||||
|> window(every: inf)
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Windowing requires a `_time` column, which is why it's necessary to
|
||||
[recreate the `_time` column](#recreate-the-time-column) after an aggregation.
|
||||
{{% /note %}}
|
||||
|
||||
###### Unwindowed output table
|
||||
```
|
||||
Table: keys: [_start, _stop, _field, _measurement]
|
||||
_start:time _stop:time _field:string _measurement:string _time:time _value:float
|
||||
------------------------------ ------------------------------ ---------------------- ---------------------- ------------------------------ ----------------------------
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:51:00.000000000Z 65.88549613952637
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:52:00.000000000Z 65.50651391347249
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:53:00.000000000Z 65.30719598134358
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:54:00.000000000Z 64.39330975214641
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49386278788249
|
||||
2018-11-03T17:50:00.000000000Z 2018-11-03T17:55:00.000000000Z used_percent mem 2018-11-03T17:55:00.000000000Z 64.49816226959229
|
||||
```
|
||||
|
||||
With the aggregate values in a single table, data points in the visualization are connected.
|
||||
|
||||

|
||||
|
||||
## Summing up
|
||||
You have now created a Flux query that windows and aggregates data.
|
||||
The data transformation process outlined in this guide should be used for all aggregation operations.
|
||||
|
||||
Flux also provides the [`aggregateWindow()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow)
|
||||
which performs all these separate functions for you.
|
||||
|
||||
The following Flux query will return the same results:
|
||||
|
||||
###### aggregateWindow function
|
||||
```js
|
||||
dataSet
|
||||
|> aggregateWindow(every: 1m, fn: mean)
|
||||
```
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: Enable Flux
|
||||
description: Instructions for enabling Flux in your InfluxDB configuration.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Enable Flux
|
||||
parent: Flux
|
||||
weight: 1
|
||||
---
|
||||
|
||||
Flux is packaged with **InfluxDB v1.8+** and does not require any additional installation;
|
||||
however, it is **disabled by default and must be enabled**.
|
||||
|
||||
## Enable Flux
|
||||
Enable Flux by setting the `flux-enabled` option to `true` under the `[http]` section of your `influxdb.conf`:
|
||||
|
||||
###### influxdb.conf
|
||||
```toml
|
||||
# ...
|
||||
|
||||
[http]
|
||||
|
||||
# ...
|
||||
|
||||
flux-enabled = true
|
||||
|
||||
# ...
|
||||
```
|
||||
|
||||
> The default location of your `influxdb.conf` depends on your operating system.
|
||||
> More information is available in the [Configuring InfluxDB](/enterprise_influxdb/v1.9/administration/config/#using-the-configuration-file) guide.
|
||||
|
||||
When InfluxDB starts, the Flux daemon starts as well, and data can be queried using Flux.
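To confirm Flux is enabled, run a simple query against an existing database. The following is a minimal sketch; `db/rp` is a placeholder for your database and retention policy:

```js
// Return up to five points written in the last five minutes
from(bucket: "db/rp")
    |> range(start: -5m)
    |> limit(n: 5)
```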
|
|
@ -0,0 +1,180 @@
|
|||
---
|
||||
title: Optimize Flux queries
|
||||
description: >
|
||||
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
|
||||
weight: 4
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
parent: Flux
|
||||
canonical: /influxdb/cloud/query-data/optimize-queries/
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/flux/guides/optimize-queries
|
||||
---
|
||||
|
||||
Optimize your Flux queries to reduce their memory and compute (CPU) requirements.
|
||||
|
||||
- [Start queries with pushdowns](#start-queries-with-pushdowns)
|
||||
- [Avoid processing filters inline](#avoid-processing-filters-inline)
|
||||
- [Avoid short window durations](#avoid-short-window-durations)
|
||||
- [Use "heavy" functions sparingly](#use-heavy-functions-sparingly)
|
||||
- [Use set() instead of map() when possible](#use-set-instead-of-map-when-possible)
|
||||
- [Balance time range and data precision](#balance-time-range-and-data-precision)
|
||||
- [Measure query performance with Flux profilers](#measure-query-performance-with-flux-profilers)
|
||||
|
||||
## Start queries with pushdowns
|
||||
**Pushdowns** are functions or function combinations that push data operations to the underlying data source rather than operating on data in memory. Start queries with pushdowns to improve query performance. Once a non-pushdown function runs, Flux pulls data into memory and runs all subsequent operations there.
|
||||
|
||||
#### Pushdown functions and function combinations
|
||||
The following pushdowns are supported in InfluxDB Enterprise 1.9+.
|
||||
|
||||
| Functions | Supported |
|
||||
| :----------------------------- | :------------------: |
|
||||
| **count()** | {{< icon "check" >}} |
|
||||
| **drop()** | {{< icon "check" >}} |
|
||||
| **duplicate()** | {{< icon "check" >}} |
|
||||
| **filter()** {{% req " \*" %}} | {{< icon "check" >}} |
|
||||
| **fill()** | {{< icon "check" >}} |
|
||||
| **first()** | {{< icon "check" >}} |
|
||||
| **group()** | {{< icon "check" >}} |
|
||||
| **keep()** | {{< icon "check" >}} |
|
||||
| **last()** | {{< icon "check" >}} |
|
||||
| **max()** | {{< icon "check" >}} |
|
||||
| **mean()** | {{< icon "check" >}} |
|
||||
| **min()** | {{< icon "check" >}} |
|
||||
| **range()** | {{< icon "check" >}} |
|
||||
| **rename()** | {{< icon "check" >}} |
|
||||
| **sum()** | {{< icon "check" >}} |
|
||||
| **window()** | {{< icon "check" >}} |
|
||||
| _Function combinations_ | |
|
||||
| **window()** \|> **count()** | {{< icon "check" >}} |
|
||||
| **window()** \|> **first()** | {{< icon "check" >}} |
|
||||
| **window()** \|> **last()** | {{< icon "check" >}} |
|
||||
| **window()** \|> **max()** | {{< icon "check" >}} |
|
||||
| **window()** \|> **min()** | {{< icon "check" >}} |
|
||||
| **window()** \|> **sum()** | {{< icon "check" >}} |
|
||||
|
||||
{{% caption %}}
|
||||
{{< req "\*" >}} **filter()** only pushes down when all parameter values are static.
|
||||
See [Avoid processing filters inline](#avoid-processing-filters-inline).
|
||||
{{% /caption %}}
|
||||
|
||||
Use pushdown functions and function combinations at the beginning of your query.
|
||||
Once a non-pushdown function runs, Flux pulls data into memory and runs all
|
||||
subsequent operations there.
|
||||
|
||||
##### Pushdown functions in use
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h) //
|
||||
|> filter(fn: (r) => r.sensor == "abc123") //
|
||||
|> group(columns: ["_field", "host"]) // Pushed to the data source
|
||||
|> aggregateWindow(every: 5m, fn: max) //
|
||||
|> filter(fn: (r) => r._value >= 90.0) //
|
||||
|
||||
|> top(n: 10) // Run in memory
|
||||
```
|
||||
|
||||
### Avoid processing filters inline
|
||||
Avoid using mathematical operations or string manipulation inline to define data filters.
|
||||
Processing filter values inline prevents `filter()` from pushing its operation down
|
||||
to the underlying data source, so data returned by the
|
||||
previous function loads into memory.
|
||||
This often results in a significant performance hit.
|
||||
|
||||
For example, the following query uses [Chronograf dashboard template variables](/{{< latest "chronograf" >}}/guides/dashboard-template-variables/)
|
||||
and string concatenation to define a region to filter by.
|
||||
Because `filter()` uses string concatenation inline, it can't push its operation
|
||||
to the underlying data source and loads all data returned from `range()` into memory.
|
||||
|
||||
```js
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r.region == v.provider + v.region)
|
||||
```
|
||||
|
||||
To dynamically set filters and maintain the pushdown ability of the `filter()` function,
|
||||
use variables to define filter values outside of `filter()`:
|
||||
|
||||
```js
|
||||
region = v.provider + v.region
|
||||
|
||||
from(bucket: "db/rp")
|
||||
|> range(start: -1h)
|
||||
|> filter(fn: (r) => r.region == region)
|
||||
```
|
||||
|
||||
## Avoid short window durations
|
||||
Windowing (grouping data based on time intervals) is commonly used to aggregate and downsample data.
|
||||
Increase performance by avoiding short window durations.
|
||||
More windows require more compute power to evaluate which window each row should be assigned to.
|
||||
Reasonable window durations depend on the total time range queried.
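For example, the following sketch (using a hypothetical `cpu` measurement and `usage_user` field) aggregates 24 hours of data into five-minute windows, producing 288 windows per series instead of the 86,400 a one-second window would create:

```js
from(bucket: "db/rp")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user")
    |> aggregateWindow(every: 5m, fn: mean)
```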
|
||||
|
||||
## Use "heavy" functions sparingly
|
||||
The following functions use more memory or CPU than others.
|
||||
Consider their necessity in your data processing before using them:
|
||||
|
||||
- [map()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
|
||||
- [reduce()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/reduce/)
|
||||
- [join()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/join/)
|
||||
- [union()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/union/)
|
||||
- [pivot()](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/pivot/)
|
||||
|
||||
## Use set() instead of map() when possible
|
||||
[`set()`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/set/),
|
||||
[`experimental.set()`](/influxdb/v2.0/reference/flux/stdlib/experimental/set/),
|
||||
and [`map()`](/influxdb/v2.0/reference/flux/stdlib/built-in/transformations/map/)
|
||||
can each set column values in data; however, **set** functions have performance
|
||||
advantages over `map()`.
|
||||
|
||||
Use the following guidelines to determine which to use:
|
||||
|
||||
- If setting a column value to a predefined, static value, use `set()` or `experimental.set()`.
|
||||
- If dynamically setting a column value using **existing row data**, use `map()`.
|
||||
|
||||
#### Set a column value to a static value
|
||||
The following queries are functionally the same, but using `set()` is more performant than using `map()`.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with foo: "bar" }))
|
||||
|
||||
// Recommended
|
||||
data
|
||||
|> set(key: "foo", value: "bar")
|
||||
```
|
||||
|
||||
#### Dynamically set a column value using existing row data
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with foo: r.bar }))
|
||||
```
|
||||
|
||||
## Balance time range and data precision
|
||||
To ensure queries are performant, balance the time range and the precision of your data.
|
||||
For example, if you query data stored every second and request six months' worth of data,
|
||||
results would include ≈15.5 million points per series.
|
||||
Depending on the number of series returned after `filter()` ([cardinality](/enterprise_influxdb/v1.9/concepts/glossary/#series-cardinality)),
|
||||
this can quickly become many billions of points.
|
||||
Flux must store these points in memory to generate a response.
|
||||
Use [pushdowns](#pushdown-functions-and-function-combinations) to optimize how
|
||||
many points are stored in memory.
|
||||
|
||||
## Measure query performance with Flux profilers
|
||||
Use the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/)
|
||||
to measure query performance and append performance metrics to your query output.
|
||||
The following Flux profilers are available:
|
||||
|
||||
- **query**: provides statistics about the execution of an entire Flux script.
|
||||
- **operator**: provides statistics about each operation in a query.
|
||||
|
||||
Import the `profiler` package and enable profilers with the `profiler.enabledProfilers` option.
|
||||
|
||||
```js
|
||||
import "profiler"
|
||||
|
||||
option profiler.enabledProfilers = ["query", "operator"]
|
||||
|
||||
// Query to profile
|
||||
```
|
||||
|
||||
For more information about Flux profilers, see the [Flux Profiler package](/influxdb/v2.0/reference/flux/stdlib/profiler/).
|
|
@ -0,0 +1,12 @@
|
|||
---
|
||||
title: InfluxDB Enterprise guides
|
||||
description: Step-by-step guides for using InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/guides/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Guides
|
||||
weight: 60
|
||||
---
|
||||
|
||||
{{< children hlevel="h2" >}}
|
|
@ -0,0 +1,267 @@
|
|||
---
|
||||
title: Calculate percentages in a query
|
||||
description: >
|
||||
Calculate percentages using basic math operators available in InfluxQL or Flux.
|
||||
This guide walks through use-cases and examples of calculating percentages from two values in a single query.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 50
|
||||
parent: Guides
|
||||
name: Calculate percentages
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/guides/calculating_percentages/
|
||||
v2: /influxdb/v2.0/query-data/flux/calculate-percentages/
|
||||
---
|
||||
|
||||
Use Flux or InfluxQL to calculate percentages in a query.
|
||||
|
||||
{{< tabs-wrapper >}}
|
||||
{{% tabs %}}
|
||||
[Flux](#)
|
||||
[InfluxQL](#)
|
||||
{{% /tabs %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
[Flux](/flux/latest/) lets you perform simple math equations, for example, calculating a percentage.
|
||||
|
||||
## Calculate a percentage
|
||||
|
||||
Learn how to calculate a percentage using the following examples:
|
||||
|
||||
- [Basic calculations within a query](#basic-calculations-within-a-query)
|
||||
- [Calculate a percentage from two fields](#calculate-a-percentage-from-two-fields)
|
||||
- [Calculate a percentage using aggregate functions](#calculate-a-percentage-using-aggregate-functions)
|
||||
- [Calculate the percentage of total weight per apple variety](#calculate-the-percentage-of-total-weight-per-apple-variety)
|
||||
- [Calculate the average percentage of total weight per variety each hour](#calculate-the-average-percentage-of-total-weight-per-variety-each-hour)
|
||||
|
||||
## Basic calculations within a query
|
||||
|
||||
When performing any math operation in a Flux query, you must complete the following steps:
|
||||
|
||||
1. Specify the [bucket](/{{< latest "influxdb" "v2" >}}/query-data/get-started/#buckets) to query from and the time range to query.
|
||||
2. Filter your data by measurements, fields, and other applicable criteria.
|
||||
3. Align values in one row (required to perform math in Flux) by using one of the following functions:
|
||||
- To query **from multiple** data sources, use the [`join()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/join/).
|
||||
- To query **from the same** data source, use the [`pivot()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/pivot/).
|
||||
|
||||
For examples using the `join()` function to calculate percentages and more examples of calculating percentages, see [Calculate percentages with Flux](/{{< latest "influxdb" "v2" >}}/query-data/flux/calculate-percentages/).
|
||||
|
||||
#### Data variable
|
||||
|
||||
To shorten examples, we'll store a basic Flux query in a `data` variable for reuse.
|
||||
|
||||
Here's how that looks in Flux:
|
||||
|
||||
```js
|
||||
// Query data from the past 15 minutes and pivot fields into columns so each row
|
||||
// contains values for each field
|
||||
data = from(bucket:"your_db/your_retention_policy")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /field[1-2]/)
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
```
|
||||
|
||||
Each row now contains the values necessary to perform a math operation. For example, to add two field keys, start with the `data` variable created above, and then use `map()` to re-map values in each row.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({ r with _value: r.field1 + r.field2}))
|
||||
```
|
||||
|
||||
> **Note:** Flux supports basic math operators such as `+`,`-`,`/`, `*`, and `()`. For example, to subtract `field2` from `field1`, change `+` to `-`.
|
||||
|
||||
## Calculate a percentage from two fields
|
||||
|
||||
Use the `data` variable created above, and then use the [`map()` function](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/map/) to divide one field by another, multiply by 100, and add a new `percent` field to store the percentage values in.
|
||||
|
||||
```js
|
||||
data
|
||||
|> map(fn: (r) => ({
|
||||
_time: r._time,
|
||||
_measurement: r._measurement,
|
||||
_field: "percent",
|
||||
_value: r.field1 / r.field2 * 100.0
|
||||
}))
|
||||
```
|
||||
|
||||
>**Note:** In this example, `field1` and `field2` are float values, so they're multiplied by 100.0. For integer values, multiply by 100 or use the `float()` function to cast integers to floats.
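For instance, a minimal sketch of casting hypothetical integer fields to floats before dividing:

```js
data
    |> map(fn: (r) => ({ r with _value: float(v: r.field1) / float(v: r.field2) * 100.0 }))
```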
|
||||
|
||||
## Calculate a percentage using aggregate functions
|
||||
|
||||
Use [`aggregateWindow()`](/{{< latest "influxdb" "v2" >}}/reference/flux/stdlib/built-in/transformations/aggregates/aggregatewindow) to window data by time and perform an aggregate function on each window.
|
||||
|
||||
```js
|
||||
from(bucket:"<database>/<retention_policy>")
|
||||
|> range(start: -15m)
|
||||
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /fieldkey[1-2]/)
|
||||
|> aggregateWindow(every: 1m, fn:sum)
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with _value: r.field1 / r.field2 * 100.0 }))
|
||||
```
|
||||
|
||||
## Calculate the percentage of total weight per apple variety
|
||||
|
||||
Use simulated apple stand data to track the weight of apples (by type) throughout a day.
|
||||
|
||||
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
|
||||
2. Import the sample data:
|
||||
|
||||
```bash
|
||||
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
|
||||
```
|
||||
|
||||
Use the following query to calculate the percentage of the total weight each variety
|
||||
accounts for at each given point in time.
|
||||
|
||||
```js
|
||||
from(bucket:"apple_stand/autogen")
|
||||
|> range(start: 2018-06-18T12:00:00Z, stop: 2018-06-19T04:35:00Z)
|
||||
|> filter(fn: (r) => r._measurement == "variety")
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with
|
||||
granny_smith: r.granny_smith / r.total_weight * 100.0 ,
|
||||
golden_delicious: r.golden_delicious / r.total_weight * 100.0 ,
|
||||
fuji: r.fuji / r.total_weight * 100.0 ,
|
||||
gala: r.gala / r.total_weight * 100.0 ,
|
||||
braeburn: r.braeburn / r.total_weight * 100.0 ,}))
|
||||
```
|
||||
|
||||
## Calculate the average percentage of total weight per variety each hour
|
||||
|
||||
With the apple stand data from the prior example, use the following query to calculate the average percentage of the total weight each variety accounts for per hour.
|
||||
|
||||
```js
|
||||
from(bucket:"apple_stand/autogen")
|
||||
|> range(start: 2018-06-18T00:00:00.00Z, stop: 2018-06-19T16:35:00.00Z)
|
||||
|> filter(fn: (r) => r._measurement == "variety")
|
||||
|> aggregateWindow(every:1h, fn: mean)
|
||||
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|
||||
|> map(fn: (r) => ({ r with
|
||||
granny_smith: r.granny_smith / r.total_weight * 100.0,
|
||||
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
|
||||
fuji: r.fuji / r.total_weight * 100.0,
|
||||
gala: r.gala / r.total_weight * 100.0,
|
||||
braeburn: r.braeburn / r.total_weight * 100.0
|
||||
}))
|
||||
```
|
||||
|
||||
{{% /tab-content %}}
|
||||
|
||||
{{% tab-content %}}
|
||||
|
||||
[InfluxQL](/enterprise_influxdb/v1.9/query_language/) lets you perform simple math equations
|
||||
which makes calculating percentages from two fields in a measurement straightforward.
|
||||
However, there are some caveats to be aware of.
|
||||
|
||||
## Basic calculations within a query
|
||||
|
||||
`SELECT` statements support the use of basic math operators such as `+`,`-`,`/`, `*`, `()`, etc.
|
||||
|
||||
```sql
|
||||
-- Add two field keys
|
||||
SELECT field_key1 + field_key2 AS "field_key_sum" FROM "measurement_name" WHERE time < now() - 15m
|
||||
|
||||
-- Subtract one field from another
|
||||
SELECT field_key1 - field_key2 AS "field_key_difference" FROM "measurement_name" WHERE time < now() - 15m
|
||||
|
||||
-- Grouping and chaining mathematical calculations
|
||||
SELECT (field_key1 + field_key2) - (field_key3 + field_key4) AS "some_calculation" FROM "measurement_name" WHERE time < now() - 15m
|
||||
```
|
||||
|
||||
## Calculating a percentage in a query
|
||||
|
||||
Using basic math functions, you can calculate a percentage by dividing one field value
|
||||
by another and multiplying the result by 100:
|
||||
|
||||
```sql
|
||||
SELECT (field_key1 / field_key2) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m
|
||||
```
|
||||
|
||||
## Calculating a percentage using aggregate functions
|
||||
|
||||
If using aggregate functions in your percentage calculation, all data must be referenced
|
||||
using aggregate functions.
|
||||
_**You can't mix aggregate and non-aggregate data.**_
|
||||
|
||||
All aggregate functions need a `GROUP BY time()` clause defining the time intervals
|
||||
in which data points are grouped and aggregated.
|
||||
|
||||
```sql
|
||||
SELECT (sum(field_key1) / sum(field_key2)) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m GROUP BY time(1m)
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
#### Sample data
|
||||
|
||||
The following example uses simulated Apple Stand data that tracks the weight of
|
||||
baskets containing different varieties of apples throughout a day of business.
|
||||
|
||||
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
|
||||
2. Import the sample data:
|
||||
|
||||
```bash
|
||||
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
|
||||
```
|
||||
|
||||
### Calculating percentage of total weight per apple variety
|
||||
|
||||
The following query calculates the percentage of the total weight each variety
|
||||
accounts for at each given point in time.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
("braeburn"/total_weight)*100,
|
||||
("granny_smith"/total_weight)*100,
|
||||
("golden_delicious"/total_weight)*100,
|
||||
("fuji"/total_weight)*100,
|
||||
("gala"/total_weight)*100
|
||||
FROM "apple_stand"."autogen"."variety"
|
||||
```
|
||||
<div class='view-in-chronograf' data-query-override='SELECT
|
||||
("braeburn"/total_weight)*100,
|
||||
("granny_smith"/total_weight)*100,
|
||||
("golden_delicious"/total_weight)*100,
|
||||
("fuji"/total_weight)*100,
|
||||
("gala"/total_weight)*100
|
||||
FROM "apple_stand"."autogen"."variety"'>
|
||||
\*</div>
|
||||
|
||||
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
|
||||
in Chronograf, it would look like:
|
||||
|
||||

|
||||
|
||||
### Calculating aggregate percentage per variety
|
||||
|
||||
The following query calculates the average percentage of the total weight each variety
|
||||
accounts for per hour.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
(mean("braeburn")/mean(total_weight))*100,
|
||||
(mean("granny_smith")/mean(total_weight))*100,
|
||||
(mean("golden_delicious")/mean(total_weight))*100,
|
||||
(mean("fuji")/mean(total_weight))*100,
|
||||
(mean("gala")/mean(total_weight))*100
|
||||
FROM "apple_stand"."autogen"."variety"
|
||||
WHERE time >= '2018-06-18T12:00:00Z' AND time <= '2018-06-19T04:35:00Z'
|
||||
GROUP BY time(1h)
|
||||
```
|
||||
<div class='view-in-chronograf' data-query-override='SELECT%0A%20%20%20%20%28mean%28"braeburn"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"granny_smith"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"golden_delicious"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"fuji"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"gala"%29%2Fmean%28total_weight%29%29%2A100%0AFROM%20"apple_stand"."autogen"."variety"%0AWHERE%20time%20>%3D%20%272018-06-18T12%3A00%3A00Z%27%20AND%20time%20<%3D%20%272018-06-19T04%3A35%3A00Z%27%0AGROUP%20BY%20time%281h%29'></div>
|
||||
|
||||
_**Note the following about this query:**_
|
||||
|
||||
- It uses aggregate functions (`mean()`) for pulling all data.
|
||||
- It includes a `GROUP BY time()` clause which aggregates data into 1 hour blocks.
|
||||
- It includes an explicitly limited time window. Without it, aggregate functions
|
||||
are very resource-intensive.
|
||||
|
||||
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
|
||||
in Chronograf, it would look like:
|
||||
|
||||

|
||||
|
||||
{{% /tab-content %}}
|
||||
{{< /tabs-wrapper >}}
|
|
@ -0,0 +1,224 @@
|
|||
---
|
||||
title: Downsample and retain data
|
||||
description: Downsample data to keep high precision while preserving storage.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 30
|
||||
parent: Guides
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/guides/downsampling_and_retention/
|
||||
v2: /influxdb/v2.0/process-data/common-tasks/downsample-data/
|
||||
---
|
||||
|
||||
InfluxDB can handle hundreds of thousands of data points per second. Working with that much data over a long period of time can create storage concerns.
|
||||
A natural solution is to downsample the data; keep the high precision raw data for only a limited time, and store the lower precision, summarized data longer.
|
||||
This guide describes how to automate the process of downsampling data and expiring old data using InfluxQL. To downsample and retain data using Flux and InfluxDB 2.0,
|
||||
see [Process Data with InfluxDB tasks](/influxdb/v2.0/process-data/).
|
||||
|
||||
### Definitions
|
||||
|
||||
- **Continuous query** (CQ) is an InfluxQL query that runs automatically and periodically within a database.
|
||||
CQs require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
|
||||
|
||||
- **Retention policy** (RP) is the part of InfluxDB data structure that describes for how long InfluxDB keeps data.
|
||||
InfluxDB compares your local server's timestamp to the timestamps on your data and deletes data older than the RP's `DURATION`.
|
||||
A single database can have several RPs and RPs are unique per database.
|
||||
|
||||
This guide doesn't go into detail about the syntax for creating and managing CQs and RPs or tasks.
|
||||
If you're new to these concepts, we recommend reviewing the following:
|
||||
|
||||
- [CQ documentation](/enterprise_influxdb/v1.9/query_language/continuous_queries/) and
|
||||
- [RP documentation](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management).
|
||||
|
||||
### Sample data
|
||||
|
||||
This section uses fictional real-time data to track the number of food orders
|
||||
to a restaurant via phone and via website at ten second intervals.
|
||||
We store this data in a [database](/enterprise_influxdb/v1.9/concepts/glossary/#database) called `food_data`, in
|
||||
the [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement) `orders`, and
|
||||
in the [fields](/enterprise_influxdb/v1.9/concepts/glossary/#field) `phone` and `website`.
|
||||
|
||||
Sample:
|
||||
|
||||
```bash
|
||||
name: orders
|
||||
------------
|
||||
time phone website
|
||||
2016-05-10T23:18:00Z 10 30
|
||||
2016-05-10T23:18:10Z 12 39
|
||||
2016-05-10T23:18:20Z 11 56
|
||||
```
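
For reference, the first row of this sample corresponds to a single point in line protocol. Once the `food_data` database exists (it's created in the next section), a point like it could be written with the HTTP write API (a minimal sketch; the timestamp is in nanoseconds):

```bash
# Write the first sample point to food_data (assumes the database already exists)
curl -XPOST 'http://localhost:8086/write?db=food_data' \
  --data-binary 'orders phone=10,website=30 1462922280000000000'
```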
|
||||
|
||||
### Goal
|
||||
|
||||
Assume that, in the long run, we're only interested in the average number of orders by phone
|
||||
and by website at 30 minute intervals.
|
||||
In the next steps, we use RPs and CQs to:
|
||||
|
||||
* Automatically aggregate the ten-second resolution data to 30-minute resolution data
|
||||
* Automatically delete the raw, ten-second resolution data that are older than two hours
|
||||
* Automatically delete the 30-minute resolution data that are older than 52 weeks
|
||||
|
||||
### Database preparation
|
||||
|
||||
We perform the following steps before writing the data to the database
|
||||
`food_data`.
|
||||
We do this **before** inserting any data because CQs only run against recent
|
||||
data; that is, data with timestamps that are no older than `now()` minus
|
||||
the `FOR` clause of the CQ, or `now()` minus the `GROUP BY time()` interval if
|
||||
the CQ has no `FOR` clause.
|
||||
|
||||
#### 1. Create the database
|
||||
|
||||
```sql
|
||||
> CREATE DATABASE "food_data"
|
||||
```
|
||||
|
||||
#### 2. Create a two-hour `DEFAULT` retention policy
|
||||
|
||||
InfluxDB writes to the `DEFAULT` retention policy if we do not supply an explicit RP when
|
||||
writing a point to the database.
|
||||
We make the `DEFAULT` RP keep data for two hours, because we want InfluxDB to
|
||||
automatically write the incoming ten-second resolution data to that RP.
|
||||
|
||||
Use the
|
||||
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.9/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
|
||||
statement to create a `DEFAULT` RP:
|
||||
|
||||
```sql
|
||||
> CREATE RETENTION POLICY "two_hours" ON "food_data" DURATION 2h REPLICATION 1 DEFAULT
|
||||
```
|
||||
|
||||
That query creates an RP called `two_hours` that exists in the database
|
||||
`food_data`.
|
||||
`two_hours` keeps data for a `DURATION` of two hours (`2h`) and it's the `DEFAULT`
|
||||
RP for the database `food_data`.
|
||||
|
||||
{{% warn %}}
|
||||
The replication factor (`REPLICATION 1`) is a required parameter but must always
|
||||
be set to 1 for single node instances.
|
||||
{{% /warn %}}
|
||||
|
||||
> **Note:** When we created the `food_data` database in step 1, InfluxDB
|
||||
automatically generated an RP named `autogen` and set it as the `DEFAULT`
|
||||
RP for the database.
|
||||
The `autogen` RP has an infinite retention period.
|
||||
With the query above, the RP `two_hours` replaces `autogen` as the `DEFAULT` RP
|
||||
for the `food_data` database.
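
To confirm which RP is currently the `DEFAULT` for the database, list its retention policies:

```sql
> SHOW RETENTION POLICIES ON "food_data"
```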
|
||||
|
||||
#### 3. Create a 52-week retention policy
|
||||
|
||||
Next we want to create another retention policy that keeps data for 52 weeks and is not the
|
||||
`DEFAULT` retention policy (RP) for the database.
|
||||
Ultimately, the 30-minute rollup data will be stored in this RP.
|
||||
|
||||
Use the
|
||||
[`CREATE RETENTION POLICY`](/enterprise_influxdb/v1.9/query_language/manage-database/#create-retention-policies-with-create-retention-policy)
|
||||
statement to create a non-`DEFAULT` retention policy:
|
||||
|
||||
```sql
|
||||
> CREATE RETENTION POLICY "a_year" ON "food_data" DURATION 52w REPLICATION 1
|
||||
```
|
||||
|
||||
That query creates a retention policy (RP) called `a_year` that exists in the database
|
||||
`food_data`.
|
||||
The `a_year` setting keeps data for a `DURATION` of 52 weeks (`52w`).
|
||||
Leaving out the `DEFAULT` argument ensures that `a_year` is not the `DEFAULT`
|
||||
RP for the database `food_data`.
|
||||
That is, write and read operations against `food_data` that do not specify an
|
||||
RP will still go to the `two_hours` RP (the `DEFAULT` RP).
|
||||
|
||||
#### 4. Create the continuous query
|
||||
|
||||
Now that we've set up our RPs, we want to create a continuous query (CQ) that will automatically
|
||||
and periodically downsample the ten-second resolution data to the 30-minute
|
||||
resolution, and then store those results in a different measurement with a different
|
||||
retention policy.
|
||||
|
||||
Use the
|
||||
[`CREATE CONTINUOUS QUERY`](/enterprise_influxdb/v1.9/query_language/continuous_queries/)
|
||||
statement to generate a CQ:
|
||||
|
||||
```sql
|
||||
> CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN
|
||||
SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone"
|
||||
INTO "a_year"."downsampled_orders"
|
||||
FROM "orders"
|
||||
GROUP BY time(30m)
|
||||
END
|
||||
```
|
||||
|
||||
That query creates a CQ called `cq_30m` in the database `food_data`.
|
||||
`cq_30m` tells InfluxDB to calculate the 30-minute average of the two fields
|
||||
`website` and `phone` in the measurement `orders` and in the `DEFAULT` RP
|
||||
`two_hours`.
|
||||
It also tells InfluxDB to write those results to the measurement
|
||||
`downsampled_orders` in the retention policy `a_year` with the field keys
|
||||
`mean_website` and `mean_phone`.
|
||||
InfluxDB will run this query every 30 minutes for the previous 30 minutes.
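
For illustration, each time `cq_30m` executes, it effectively runs a query like the following against the most recent 30-minute interval (the time bounds shown here are placeholders):

```sql
SELECT mean("website") AS "mean_website", mean("phone") AS "mean_phone"
INTO "a_year"."downsampled_orders"
FROM "orders"
WHERE time >= '2016-05-13T22:30:00Z' AND time < '2016-05-13T23:00:00Z'
GROUP BY time(30m)
```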
|
||||
|
||||
> **Note:** Notice that we fully qualify (that is, we use the syntax
|
||||
`"<retention_policy>"."<measurement>"`) the measurement in the `INTO`
|
||||
clause.
|
||||
InfluxDB requires that syntax to write data to an RP other than the `DEFAULT`
|
||||
RP.
|
||||
|
||||
### Results
|
||||
|
||||
With the new CQ and two new RPs, `food_data` is ready to start receiving data.
|
||||
After writing data to our database and letting things run for a bit, we see
|
||||
two measurements: `orders` and `downsampled_orders`.
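
You can confirm that both measurements exist with a quick schema query:

```sql
> SHOW MEASUREMENTS ON "food_data"
```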
|
||||
|
||||
```sql
|
||||
> SELECT * FROM "orders" LIMIT 5
|
||||
name: orders
|
||||
---------
|
||||
time phone website
|
||||
2016-05-13T23:00:00Z 10 30
|
||||
2016-05-13T23:00:10Z 12 39
|
||||
2016-05-13T23:00:20Z 11 56
|
||||
2016-05-13T23:00:30Z 8 34
|
||||
2016-05-13T23:00:40Z 17 32
|
||||
|
||||
> SELECT * FROM "a_year"."downsampled_orders" LIMIT 5
|
||||
name: downsampled_orders
|
||||
---------------------
|
||||
time mean_phone mean_website
|
||||
2016-05-13T15:00:00Z 12 23
|
||||
2016-05-13T15:30:00Z 13 32
|
||||
2016-05-13T16:00:00Z 19 21
|
||||
2016-05-13T16:30:00Z 3 26
|
||||
2016-05-13T17:00:00Z 4 23
|
||||
```
|
||||
|
||||
The data in `orders` are the raw, ten-second resolution data that reside in the
|
||||
two-hour RP.
|
||||
The data in `downsampled_orders` are the aggregated, 30-minute resolution data
|
||||
that are subject to the 52-week RP.
|
||||
|
||||
Notice that the first timestamps in `downsampled_orders` are older than the first
|
||||
timestamps in `orders`.
|
||||
This is because InfluxDB has already deleted data from `orders` with timestamps
|
||||
that are older than our local server's timestamp minus two hours (assume we
|
||||
executed the `SELECT` queries at `2016-05-14T00:59:59Z`).
|
||||
InfluxDB will only start dropping data from `downsampled_orders` after 52 weeks.
|
||||
|
||||
> **Notes:**
|
||||
>
|
||||
* Notice that we fully qualify (that is, we use the syntax
|
||||
`"<retention_policy>"."<measurement>"`) `downsampled_orders` in
|
||||
the second `SELECT` statement. We must specify the RP in that query to `SELECT`
|
||||
data that reside in an RP other than the `DEFAULT` RP.
|
||||
>
|
||||
* By default, InfluxDB checks to enforce an RP every 30 minutes.
|
||||
Between checks, `orders` may have data that are older than two hours.
|
||||
The rate at which InfluxDB checks to enforce an RP is a configurable setting;
|
||||
see
|
||||
[Database Configuration](/enterprise_influxdb/v1.9/administration/config#check-interval-30m0s).
|
||||
|
||||
Using a combination of RPs and CQs, we've successfully set up our database to
|
||||
automatically keep the high precision raw data for a limited time, create lower
|
||||
precision data, and store that lower precision data for a longer period of time.
|
||||
Now that you have a general understanding of how these features can work
|
||||
together, check out the detailed documentation on [CQs](/enterprise_influxdb/v1.9/query_language/continuous_queries/) and [RPs](/enterprise_influxdb/v1.9/query_language/manage-database/#retention-policy-management)
|
||||
to see all that they can do for you.
|
|
@ -0,0 +1,625 @@
|
|||
---
|
||||
title: Use fine-grained authorization in InfluxDB Enterprise
|
||||
description: >
|
||||
Fine-grained authorization (FGA) in InfluxDB Enterprise controls user access at the database, measurement, and series levels.
|
||||
aliases:
  - /docs/v1.5/administration/fga
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Use fine-grained authorization
|
||||
weight: 10
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
Use fine-grained authorization (FGA) in InfluxDB Enterprise to control user access at the database, measurement, and series levels.
|
||||
|
||||
> **Note:** InfluxDB OSS controls access at the database level only.
|
||||
|
||||
You must have [admin permissions](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#admin-user-management) to set up FGA.
|
||||
|
||||
## Set up fine-grained authorization
|
||||
|
||||
1. [Enable authentication](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#set-up-authentication) in your InfluxDB configuration file.
|
||||
|
||||
2. Create users through the InfluxDB query API.
|
||||
|
||||
```sql
|
||||
CREATE USER username WITH PASSWORD 'password'
|
||||
```
|
||||
|
||||
For more information, see [User management commands](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-management-commands).
|
||||
|
||||
3. Ensure that you can access the **meta node** API (port 8091 by default).
|
||||
|
||||
> In a typical cluster configuration, the HTTP ports for data nodes
|
||||
> (8086 by default) are exposed to clients but the meta node HTTP ports are not.
|
||||
> You may need to work with your network administrator to gain access to the meta node HTTP ports.
|
||||
|
||||
4. _(Optional)_ [Create roles](#manage-roles).
|
||||
Roles let you grant permissions to groups of users assigned to each role.
|
||||
|
||||
> For an overview of how users and roles work in InfluxDB Enterprise, see [InfluxDB Enterprise users](/enterprise_influxdb/v1.9/features/users/).
|
||||
|
||||
5. [Set up restrictions](#manage-restrictions).
|
||||
Restrictions apply to all non-admin users.
|
||||
|
||||
> Permissions (currently "read" and "write") may be restricted independently depending on the scenario.
|
||||
|
||||
6. [Set up grants](#manage-grants) to remove restrictions for specified users and roles.
|
||||
|
||||
---
|
||||
|
||||
{{% note %}}
|
||||
#### Notes about examples
|
||||
The examples below use `curl`, a command line tool for transferring data, to send
|
||||
HTTP requests to the Meta API, and [`jq`](https://stedolan.github.io/jq/), a command line JSON processor,
|
||||
to make the JSON output easier to read.
|
||||
Alternatives for each are available, but are not covered in this documentation.
|
||||
|
||||
All examples assume authentication is enabled in InfluxDB.
|
||||
Admin credentials must be sent with each request.
|
||||
Use the `curl -u` flag to pass authentication credentials:
|
||||
|
||||
```sh
|
||||
curl -u username:password # ...
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
---
|
||||
|
||||
## Matching methods
|
||||
The following matching methods are available when managing restrictions and grants to databases, measurements, or series:
|
||||
|
||||
- `exact` (matches only exact string matches)
|
||||
- `prefix` (matches strings that begin with a specified prefix)
|
||||
|
||||
```sh
|
||||
# Match a database name exactly
|
||||
"database": {"match": "exact", "value": "my_database"}
|
||||
|
||||
# Match any databases that begin with "my_"
|
||||
"database": {"match": "prefix", "value": "my_"}
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Wildcard matching
|
||||
Neither `exact` nor `prefix` matching methods allow for wildcard matching.
|
||||
{{% /note %}}
|
||||
|
||||
## Manage roles
|
||||
Roles allow you to assign permissions to groups of users.
|
||||
The following examples assume the `user1`, `user2` and `ops` users already exist in InfluxDB.
|
||||
|
||||
### Create a role
|
||||
To create a new role, use the InfluxDB Meta API `/role` endpoint with the `action`
|
||||
field set to `create` in the request body.
|
||||
|
||||
The following examples create two new roles:
|
||||
|
||||
- east
|
||||
- west
|
||||
|
||||
```sh
|
||||
# Create east role
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "create",
|
||||
"role": {
|
||||
"name": "east"
|
||||
}
|
||||
}'
|
||||
|
||||
# Create west role
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "create",
|
||||
"role": {
|
||||
"name": "west"
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Specify role permissions
|
||||
To specify permissions for a role,
|
||||
use the InfluxDB Meta API `/role` endpoint with the `action` field set to `add-permissions`.
|
||||
Specify the [permissions](/chronograf/v1.8/administration/managing-influxdb-users/#permissions) to add for each database.
|
||||
|
||||
The following example sets read and write permissions on `db1` for both `east` and `west` roles.
|
||||
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-permissions",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-permissions",
|
||||
"role": {
|
||||
"name": "west",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Remove role permissions
|
||||
To remove permissions from a role, use the InfluxDB Meta API `/role` endpoint with the `action` field
|
||||
set to `remove-permissions`.
|
||||
Specify the [permissions](/{{< latest "chronograf" >}}/administration/managing-influxdb-users/#permissions) to remove from each database.
|
||||
|
||||
The following example removes read and write permissions from `db1` for the `east` role.
|
||||
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "remove-permissions",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"permissions": {
|
||||
"db1": ["ReadData", "WriteData"]
|
||||
}
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### Assign users to a role
|
||||
To assign users to a role, set the `action` field to `add-users` and include a list
|
||||
of users in the `role` field.
|
||||
|
||||
The following examples add `user1` and `ops` to the `east` role, and `user2` and `ops` to the `west` role.
|
||||
|
||||
```sh
|
||||
# Add user1 and ops to the east role
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-users",
|
||||
"role": {
|
||||
"name": "east",
|
||||
"users": ["user1", "ops"]
|
||||
}
|
||||
}'
|
||||
|
||||
# Add user2 and ops to the west role
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "add-users",
|
||||
"role": {
|
||||
"name": "west",
|
||||
"users": ["user2", "ops"]
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
### View existing roles
|
||||
To view existing roles with their assigned permissions and users, use the `GET`
|
||||
request method with the InfluxDB Meta API `/role` endpoint.
|
||||
|
||||
```sh
|
||||
curl -L -u "admin-username:admin-password" -XGET "http://localhost:8091/role" | jq
|
||||
```
|
||||
|
||||
### Delete a role
|
||||
To delete a role, use the InfluxDB Meta API `/role` endpoint and set the `action`
|
||||
field to `delete`, and include the name of the role to delete.
|
||||
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/role" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"action": "delete",
|
||||
"role": {
|
||||
"name": "west"
|
||||
}
|
||||
}'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
Deleting a role does not delete users assigned to the role.
|
||||
{{% /note %}}
|
||||
|
||||
## Manage restrictions
|
||||
Restrictions restrict either or both read and write permissions on InfluxDB assets.
|
||||
Restrictions apply to all non-admin users.
|
||||
[Grants](#manage-grants) override restrictions.
|
||||
|
||||
> To run meta queries (such as `SHOW MEASUREMENTS` or `SHOW TAG KEYS`),
|
||||
> users must have read permissions for the database and retention policy they are querying.
|
||||
|
||||
Manage restrictions using the InfluxDB Meta API `acl/restrictions` endpoint.
|
||||
|
||||
```sh
|
||||
curl -L -XGET "http://localhost:8091/influxdb/v2/acl/restrictions"
|
||||
```
|
||||
|
||||
- [Restrict by database](#restrict-by-database)
|
||||
- [Restrict by measurement in a database](#restrict-by-measurement-in-a-database)
|
||||
- [Restrict by series in a database](#restrict-by-series-in-a-database)
|
||||
- [View existing restrictions](#view-existing-restrictions)
|
||||
- [Update a restriction](#update-a-restriction)
|
||||
- [Remove a restriction](#remove-a-restriction)
|
||||
|
||||
> **Note:** For the best performance, set up minimal restrictions.
|
||||
|
||||
### Restrict by database
|
||||
In most cases, restricting the database is the simplest option, and has minimal impact on performance.
|
||||
The following example restricts reads and writes on the `my_database` database.
|
||||
|
||||
```sh
|
||||
curl -L -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
### Restrict by measurement in a database
|
||||
The following example restricts read and write permissions on the `network`
|
||||
measurement in the `my_database` database.
|
||||
_This restriction does not apply to other measurements in the `my_database` database._
|
||||
|
||||
```sh
|
||||
curl -L -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
### Restrict by series in a database
|
||||
The most fine-grained restriction option is to restrict specific tags in a measurement and database.
|
||||
The following example restricts read and write permissions on the `datacenter=east` tag in the
|
||||
`network` measurement in the `my_database` database.
|
||||
_This restriction does not apply to other tags or tag values in the `network` measurement._
|
||||
|
||||
```sh
|
||||
curl -L -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
_Consider this option carefully, as it allows writes to `network` without tags or
|
||||
writes to `network` with a tag key of `datacenter` and a tag value of anything but `east`._
|
||||
|
||||
##### Apply restrictions to a series defined by multiple tags
|
||||
```sh
|
||||
curl -L -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [
|
||||
{"match": "exact", "key": "tag1", "value": "value1"},
|
||||
{"match": "exact", "key": "tag2", "value": "value2"}
|
||||
],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
```
|
||||
|
||||
{{% note %}}
|
||||
#### Create multiple restrictions at a time
|
||||
There may be times when you need to create several restrictions, each with a unique value.
|
||||
To create multiple restrictions for a list of values, use a bash `for` loop:
|
||||
|
||||
```sh
|
||||
for value in val1 val2 val3 val4; do
|
||||
curl -L -XPOST "http://localhost:8091/influxdb/v2/acl/restrictions" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "'$value'"}],
|
||||
"permissions": ["read", "write"]
|
||||
}'
|
||||
done
|
||||
```
|
||||
{{% /note %}}
|
||||
|
||||
### View existing restrictions
|
||||
To view existing restrictions, use the `GET` request method with the `acl/restrictions` endpoint.
|
||||
|
||||
```sh
|
||||
curl -L -u "admin-username:admin-password" -XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
|
||||
```
|
||||
|
||||
### Update a restriction
|
||||
_You cannot directly modify a restriction.
|
||||
Delete the existing restriction and create a new one with updated parameters._
|
||||
|
||||
### Remove a restriction
|
||||
To remove a restriction, obtain the restriction ID using the `GET` request method
|
||||
with the `acl/restrictions` endpoint.
|
||||
Use the `DELETE` request method to delete a restriction by ID.
|
||||
|
||||
```sh
|
||||
# Obtain the restriction ID from the list of restrictions
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/restrictions" | jq
|
||||
|
||||
# Delete the restriction using the restriction ID
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XDELETE "http://localhost:8091/influxdb/v2/acl/restrictions/<restriction_id>"
|
||||
```
|
||||
|
||||
## Manage grants
|
||||
Grants remove restrictions and grant users or roles either or both read and write
|
||||
permissions on InfluxDB assets.
|
||||
|
||||
Manage grants using the InfluxDB Meta API `acl/grants` endpoint.
|
||||
|
||||
```sh
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants"
|
||||
```
|
||||
|
||||
- [Grant permissions by database](#grant-permissions-by-database)
|
||||
- [Grant permissions by measurement in a database](#grant-permissions-by-measurement-in-a-database)
|
||||
- [Grant permissions by series in a database](#grant-permissions-by-series-in-a-database)
|
||||
- [View existing grants](#view-existing-grants)
|
||||
- [Update a grant](#update-a-grant)
|
||||
- [Remove a grant](#remove-a-grant)
|
||||
|
||||
### Grant permissions by database
|
||||
The following examples grant read and write permissions on the `my_database` database.
|
||||
|
||||
> **Note:** This offers no guarantee that the users will write to the correct measurement or use the correct tags.
|
||||
|
||||
##### Grant database-level permissions to users
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"],
|
||||
"users": [
|
||||
{"name": "user1"},
|
||||
{"name": "user2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant database-level permissions to roles
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [
|
||||
{"name": "role1"},
|
||||
{"name": "role2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant permissions by measurement in a database
|
||||
The following examples grant permissions to the `network` measurement in the `my_database` database.
|
||||
These grants do not apply to other measurements in the `my_database` database nor
|
||||
guarantee that users will use the correct tags.
|
||||
|
||||
##### Grant measurement-level permissions to users
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"],
|
||||
"users": [
|
||||
{"name": "user1"},
|
||||
{"name": "user2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant measurement-level permissions to roles
|
||||
|
||||
```sh
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [
|
||||
{"name": "role1"},
|
||||
{"name": "role2"}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant permissions by series in a database
|
||||
|
||||
The following examples grant access only to data with the corresponding `datacenter` tag.
|
||||
_Neither grant guarantees that users will use the `network` measurement._
|
||||
|
||||
##### Grant series-level permissions to users
|
||||
```sh
|
||||
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user1"}]
|
||||
}'
|
||||
|
||||
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant series-level permissions to roles
|
||||
```sh
|
||||
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role1"}]
|
||||
}'
|
||||
|
||||
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
### Grant access to specific series in a measurement
|
||||
The following examples grant read and write permissions to corresponding `datacenter`
|
||||
tags in the `network` measurement.
|
||||
_They each specify the measurement in the request body._
|
||||
|
||||
##### Grant series-level permissions in a measurement to users
|
||||
```sh
|
||||
# Grant user1 read/write permissions on data with the 'datacenter=east' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user1"}]
|
||||
}'
|
||||
|
||||
# Grant user2 read/write permissions on data with the 'datacenter=west' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"users": [{"name": "user2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
##### Grant series-level permissions in a measurement to roles
|
||||
```sh
|
||||
# Grant role1 read/write permissions on data with the 'datacenter=east' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "east"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role1"}]
|
||||
}'
|
||||
|
||||
# Grant role2 read/write permissions on data with the 'datacenter=west' tag set
|
||||
# inside the 'network' measurement.
|
||||
curl -s -L -XPOST "http://localhost:8091/influxdb/v2/acl/grants" \
|
||||
-u "admin-username:admin-password" \
|
||||
-H "Content-Type: application/json" \
|
||||
--data-binary '{
|
||||
"database": {"match": "exact", "value": "my_database"},
|
||||
"measurement": {"match": "exact", "value": "network"},
|
||||
"tags": [{"match": "exact", "key": "datacenter", "value": "west"}],
|
||||
"permissions": ["read", "write"],
|
||||
"roles": [{"name": "role2"}]
|
||||
}'
|
||||
```
|
||||
|
||||
Grants for specific series also apply to [meta queries](/enterprise_influxdb/v1.9/query_language/schema_exploration).
|
||||
Results from meta queries are restricted based on series-level permissions.
|
||||
For example, `SHOW TAG VALUES` only returns tag values that the user is authorized to see.
|
||||
|
||||
With these grants in place, a user or role can only read or write data from or to
|
||||
the `network` measurement if the data includes the appropriate `datacenter` tag set.
|
||||
|
||||
{{% note %}}
|
||||
Note that this only requires the presence of that tag;
|
||||
a point tagged `datacenter=east,foo=bar` will also be accepted.
|
||||
{{% /note %}}
|
||||
|
||||
### View existing grants
|
||||
To view existing grants, use the `GET` request method with the `acl/grants` endpoint.
|
||||
|
||||
```sh
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
|
||||
```
|
||||
|
||||
### Update a grant
|
||||
_You cannot directly modify a grant.
|
||||
Delete the existing grant and create a new one with updated parameters._
|
||||
|
||||
### Remove a grant
|
||||
To delete a grant, obtain the grant ID using the `GET` request method with the
|
||||
`acl/grants` endpoint.
|
||||
Use the `DELETE` request method to delete a grant by ID.
|
||||
|
||||
```sh
|
||||
# Obtain the grant ID from the list of grants
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XGET "http://localhost:8091/influxdb/v2/acl/grants" | jq
|
||||
|
||||
# Delete the grant using the grant ID
|
||||
curl -L -u "admin-username:admin-password" \
|
||||
-XDELETE "http://localhost:8091/influxdb/v2/acl/grants/<grant_id>"
|
||||
```
|
|
@ -0,0 +1,296 @@
|
|||
---
|
||||
title: Enable HTTPS for InfluxDB Enterprise
|
||||
description: >
|
||||
Enabling HTTPS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Enable HTTPS
|
||||
weight: 100
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
Enabling HTTPS encrypts the communication between clients and the InfluxDB Enterprise server, and between nodes in the cluster.
|
||||
When configured with a signed certificate, HTTPS can also verify the authenticity of the InfluxDB Enterprise server to connecting clients.
|
||||
|
||||
This page outlines how to set up HTTPS with InfluxDB Enterprise using either a signed or self-signed certificate.
|
||||
|
||||
{{% warn %}}
|
||||
InfluxData **strongly recommends** enabling HTTPS, especially if you plan on sending requests to InfluxDB Enterprise over a network.
|
||||
{{% /warn %}}
|
||||
|
||||
{{% note %}}
|
||||
These steps have been tested on Debian-based Linux distributions.
|
||||
Specific steps may vary on other operating systems.
|
||||
{{% /note %}}
|
||||
|
||||
## Requirements
|
||||
|
||||
To enable HTTPS with InfluxDB Enterprise, you need a Transport Layer Security (TLS) certificate, also known as a Secure Sockets Layer (SSL) certificate.
|
||||
InfluxDB supports three types of TLS certificates:
|
||||
|
||||
* **Single domain certificates signed by a [Certificate Authority](https://en.wikipedia.org/wiki/Certificate_authority)**
|
||||
|
||||
Single domain certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
|
||||
These certificates are signed and issued by a trusted, third-party Certificate Authority (CA).
|
||||
With this certificate option, every InfluxDB instance requires a unique single domain certificate.
|
||||
|
||||
* **Wildcard certificates signed by a Certificate Authority**
|
||||
|
||||
These certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
|
||||
Wildcard certificates can be used across multiple InfluxDB Enterprise instances on different servers.
|
||||
|
||||
* **Self-signed certificates**
|
||||
|
||||
Self-signed certificates are _not_ signed by a trusted, third-party CA.
|
||||
Unlike CA-signed certificates, self-signed certificates only provide cryptographic security to HTTPS requests.
|
||||
They do not allow clients to verify the identity of the InfluxDB server.
|
||||
With this certificate option, every InfluxDB Enterprise instance requires a unique self-signed certificate.
|
||||
You can generate a self-signed certificate on your own machine.
|
||||
|
||||
Regardless of your certificate's type, InfluxDB Enterprise supports certificates composed of
|
||||
a private key file (`.key`) and a signed certificate file (`.crt`), as well as certificates
|
||||
that combine the private key file and the signed certificate file into a single bundled file (`.pem`).
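
If your certificate authority provides the key and certificate as separate files but you prefer the single bundled format, you can typically concatenate them yourself (a minimal sketch; file names are illustrative):

```sh
# Combine a key and certificate into a single .pem bundle
cat influxdb.crt influxdb.key > influxdb.pem
```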
|
||||
|
||||
## Set up HTTPS in an InfluxDB Enterprise cluster
|
||||
|
||||
1. **Download or generate certificate files**
|
||||
|
||||
If using a certificate provided by a CA, follow their instructions to download the certificate files.
|
||||
|
||||
{{% note %}}
|
||||
If using one or more self-signed certificates, use the `openssl` utility to create a certificate.
|
||||
The following command generates a private key file (`.key`) and a self-signed
|
||||
certificate file (`.crt`) which remain valid for the specified `NUMBER_OF_DAYS`.
|
||||
|
||||
```sh
|
||||
sudo openssl req -x509 -nodes -newkey rsa:2048 \
|
||||
-keyout influxdb-selfsigned.key \
|
||||
-out influxdb-selfsigned.crt \
|
||||
-days <NUMBER_OF_DAYS>
|
||||
```
|
||||
|
||||
The command will prompt you for more information.
|
||||
You can choose to fill out these fields or leave them blank; both actions generate valid certificate files.
|
||||
|
||||
In subsequent steps, you will need to copy the certificate and key (or `.pem` file) to each node in the cluster.
|
||||
{{% /note %}}
|
||||
|
||||
2. **Install the SSL/TLS certificate in each Node**
|
||||
|
||||
Place the private key file (`.key`) and the signed certificate file (`.crt`)
|
||||
or the single bundled file (`.pem`)
|
||||
in the `/etc/ssl/` directory of each meta node and data node.
|
||||
|
||||
{{% note %}}
|
||||
Some Certificate Authorities provide certificate files with other extensions.
|
||||
Consult your CA if you are unsure about how to use these files.
|
||||
{{% /note %}}
|
||||
|
||||
3. **Ensure file permissions for each Node**
|
||||
|
||||
Certificate files require read and write access by the `influxdb` user.
|
||||
Ensure that you have the correct file permissions in each meta node and data node by running the following commands:
|
||||
|
||||
```sh
|
||||
sudo chown influxdb:influxdb /etc/ssl/
|
||||
sudo chmod 644 /etc/ssl/<CA-certificate-file>
|
||||
sudo chmod 600 /etc/ssl/<private-key-file>
|
||||
```
|
||||
|
||||
4. **Enable HTTPS within the configuration file for each meta node**
|
||||
|
||||
Enable HTTPS for each meta node within the `[meta]` section of the meta node configuration file (`influxdb-meta.conf`) by setting:
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether HTTPS is enabled.
|
||||
https-enabled = true
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-meta.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-meta.key"
|
||||
|
||||
# If using a self-signed certificate:
|
||||
https-insecure-tls = true
|
||||
|
||||
# Use TLS when communicating with data nodes
|
||||
data-use-tls = true
|
||||
data-insecure-tls = true
|
||||
|
||||
```
|
||||
|
||||
5. **Enable HTTPS within the configuration file for each data node**
|
||||
|
||||
Make the following sets of changes in the configuration file (`influxdb.conf`) on each data node:
|
||||
|
||||
1. Enable HTTPS for each data node within the `[http]` section of the configuration file by setting:
|
||||
|
||||
```toml
|
||||
[http]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether HTTPS is enabled.
|
||||
https-enabled = true
|
||||
|
||||
[...]
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-data.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-data.key"
|
||||
```
|
||||
|
||||
2. Configure the data nodes to use HTTPS when communicating with other data nodes.
|
||||
In the `[cluster]` section of the configuration file, set the following:
|
||||
|
||||
```toml
|
||||
[cluster]
|
||||
|
||||
[...]
|
||||
|
||||
# Determines whether data nodes use HTTPS to communicate with each other.
|
||||
https-enabled = true
|
||||
|
||||
# The SSL certificate to use when HTTPS is enabled.
|
||||
https-certificate = "influxdb-data.crt"
|
||||
|
||||
# Use a separate private key location.
|
||||
https-private-key = "influxdb-data.key"
|
||||
|
||||
# If using a self-signed certificate:
|
||||
https-insecure-tls = true
|
||||
```
|
||||
|
||||
3. Configure the data nodes to use HTTPS when communicating with the meta nodes.
|
||||
In the `[meta]` section of the configuration file, set the following:
|
||||
|
||||
```toml
|
||||
[meta]
|
||||
|
||||
[...]
|
||||
meta-tls-enabled = true
|
||||
|
||||
# If using a self-signed certificate:
|
||||
meta-insecure-tls = true
|
||||
```
|
||||
|
||||
6. **Restart InfluxDB Enterprise**
|
||||
|
||||
Restart the InfluxDB Enterprise meta node processes for the configuration changes to take effect:
|
||||
|
||||
```sh
|
||||
sudo systemctl restart influxdb-meta
|
||||
```
|
||||
|
||||
Restart the InfluxDB Enterprise data node processes for the configuration changes to take effect:
|
||||
|
||||
```sh
|
||||
sudo systemctl restart influxdb
|
||||
```
|
||||
|
||||
7. **Verify the HTTPS Setup**
|
||||
|
||||
Verify that HTTPS is working on the meta nodes by using `influxd-ctl`.
|
||||
|
||||
```sh
|
||||
influxd-ctl -bind-tls show
|
||||
```
|
||||
|
||||
If using a self-signed certificate, use:
|
||||
|
||||
```sh
|
||||
influxd-ctl -bind-tls -k show
|
||||
```
|
||||
|
||||
{{% warn %}}
|
||||
Once you have enabled HTTPS, you must use `-bind-tls` in order for `influxd-ctl` to connect to the meta node.
|
||||
With a self-signed certificate, you must also use the `-k` option to skip certificate verification.
|
||||
{{% /warn %}}
|
||||
|
||||
A successful connection returns output which should resemble the following:
|
||||
|
||||
```
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-01:8088 1.x.y-c1.x.y
|
||||
5 enterprise-data-02:8088 1.x.y-c1.x.y
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 1.x.y-c1.x.z
|
||||
enterprise-meta-02:8091 1.x.y-c1.x.z
|
||||
enterprise-meta-03:8091 1.x.y-c1.x.z
|
||||
```
|
||||
|
||||
Next, verify that HTTPS is working by connecting to InfluxDB Enterprise with the [`influx` command line interface](/enterprise_influxdb/v1.9/tools/use-influx/):
|
||||
|
||||
```sh
|
||||
influx -ssl -host <domain_name>.com
|
||||
```
|
||||
|
||||
If using a self-signed certificate, use:
|
||||
|
||||
```sh
|
||||
influx -ssl -unsafeSsl -host <domain_name>.com
|
||||
```
|
||||
|
||||
A successful connection returns the following:
|
||||
|
||||
```sh
|
||||
Connected to https://<domain_name>.com:8086 version 1.x.y
|
||||
InfluxDB shell version: 1.x.y
|
||||
>
|
||||
```
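
You can also check the HTTPS endpoint directly with `curl`; the `/ping` endpoint returns `204 No Content` when the node is reachable (add `-k` if you are using a self-signed certificate):

```sh
curl -v https://<domain_name>.com:8086/ping
```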
|
||||
|
||||
That's it! You've successfully set up HTTPS with InfluxDB Enterprise.
|
||||
|
||||
## Connect Telegraf to a secured InfluxDB Enterprise instance
|
||||
|
||||
Connecting [Telegraf](/{{< latest "telegraf" >}}/)
|
||||
to an HTTPS-enabled InfluxDB Enterprise instance requires some additional steps.
|
||||
|
||||
In Telegraf's configuration file (`/etc/telegraf/telegraf.conf`), under the OUTPUT PLUGINS section,
|
||||
edit the `urls` setting to indicate `https` instead of `http`.
|
||||
Also change `localhost` to the relevant domain name.
|
||||
|
||||
The best practice in terms of security is to transfer the certificate to the client and make it trusted
|
||||
(either by adding it to the operating system's trusted certificate store or by using the `ssl_ca` option).
|
||||
The alternative is to sign the certificate using an internal CA and then trust the CA certificate.
|
||||
|
||||
If you're using a self-signed certificate,
|
||||
uncomment the `insecure_skip_verify` setting and set it to `true`.
|
||||
|
||||
```toml
|
||||
###############################################################################
|
||||
# OUTPUT PLUGINS #
|
||||
###############################################################################
|
||||
|
||||
# Configuration for influxdb server to send metrics to
|
||||
[[outputs.influxdb]]
|
||||
## The full HTTP or UDP endpoint URL for your InfluxDB Enterprise instance.
|
||||
## Multiple urls can be specified as part of the same cluster,
|
||||
## this means that only ONE of the urls will be written to each interval.
|
||||
# urls = ["udp://localhost:8089"] # UDP endpoint example
|
||||
urls = ["https://<domain_name>.com:8086"]
|
||||
|
||||
[...]
|
||||
|
||||
## Optional SSL Config
|
||||
[...]
|
||||
insecure_skip_verify = true # <-- Update only if you're using a self-signed certificate
|
||||
```
|
||||
|
||||
Next, restart Telegraf and you're all set!
|
||||
|
||||
```sh
|
||||
sudo systemctl restart telegraf
|
||||
```
|
|
@ -0,0 +1,111 @@
|
|||
---
|
||||
title: Query data with the InfluxDB API
|
||||
description: Query data with Flux and InfluxQL in the InfluxDB API.
|
||||
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 20
|
||||
parent: Guides
|
||||
aliases:
|
||||
  - /enterprise_influxdb/v1.9/guides/querying_data/
  - /docs/v1.8/query_language/querying_data/
|
||||
v2: /influxdb/v2.0/query-data/
|
||||
---
|
||||
|
||||
|
||||
The InfluxDB API is the primary means for querying data in InfluxDB (see the [command line interface](/enterprise_influxdb/v1.9/tools/use-influx/) and [client libraries](/enterprise_influxdb/v1.9/tools/api_client_libraries/) for alternative ways to query the database).
|
||||
|
||||
Query data with the InfluxDB API using [Flux](#query-data-with-flux) or [InfluxQL](#query-data-with-influxql).
|
||||
|
||||
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
|
||||
|
||||
## Query data with Flux
|
||||
|
||||
For Flux queries, the `/api/v2/query` endpoint accepts `POST` HTTP requests. Use the following HTTP headers:
|
||||
- `Accept: application/csv`
|
||||
- `Content-type: application/vnd.flux`
|
||||
|
||||
If you have authentication enabled, provide your InfluxDB username and password with the `Authorization` header and `Token` schema. For example: `Authorization: Token username:password`.
|
||||
|
||||
|
||||
The following example queries Telegraf data using Flux:
|
||||
|
||||
|
||||
```bash
|
||||
curl -XPOST localhost:8086/api/v2/query -sS \
|
||||
-H 'Accept:application/csv' \
|
||||
-H 'Content-type:application/vnd.flux' \
|
||||
-d 'from(bucket:"telegraf")
|
||||
|> range(start:-5m)
|
||||
|> filter(fn:(r) => r._measurement == "cpu")'
|
||||
```
|
||||
Flux returns [annotated CSV](/influxdb/v2.0/reference/syntax/annotated-csv/):
|
||||
|
||||
```
|
||||
,result,table,_start,_stop,_time,_value,_field,_measurement,cpu,host
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:19Z,4.152553004641827,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:29Z,7.608695652173913,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:39Z,2.9363988504310883,usage_user,cpu,cpu-total,host1
|
||||
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:49Z,6.915093159934975,usage_user,cpu,cpu-total,host1
|
||||
```
|
||||
|
||||
The header row defines column labels for the table. The `cpu` [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement) has four points, each represented by one of the record rows. For example, the first point has a [timestamp](/enterprise_influxdb/v1.9/concepts/glossary/#timestamp) of `2020-04-07T18:08:19Z`.
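
If authentication is enabled, the same request must include the `Authorization` header described above (the credentials below are placeholders):

```bash
curl -XPOST localhost:8086/api/v2/query -sS \
  -H 'Authorization: Token username:password' \
  -H 'Accept:application/csv' \
  -H 'Content-type:application/vnd.flux' \
  -d 'from(bucket:"telegraf") |> range(start:-5m) |> filter(fn:(r) => r._measurement == "cpu")'
```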
|
||||
|
||||
### Flux
|
||||
|
||||
Check out [Get started with Flux](/influxdb/v2.0/query-data/get-started/) to learn more about building queries with Flux.
|
||||
For more information about querying data with the InfluxDB API using Flux, see the [API reference documentation](/enterprise_influxdb/v1.9/tools/api/#influxdb-2-0-api-compatibility-endpoints).
|
||||
|
||||
## Query data with InfluxQL
|
||||
|
||||
To perform an InfluxQL query, send a `GET` request to the `/query` endpoint, set the URL parameter `db` as the target database, and set the URL parameter `q` as your query.
|
||||
You can also use a `POST` request by sending the same parameters either as URL parameters or as part of the body with `application/x-www-form-urlencoded`.
|
||||
The example below uses the InfluxDB API to query the same database that you encountered in [Writing Data](/enterprise_influxdb/v1.9/guides/writing_data/).
|
||||
|
||||
```bash
|
||||
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
|
||||
```
|
||||
|
||||
InfluxDB returns JSON:
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"results": [
|
||||
{
|
||||
"statement_id": 0,
|
||||
"series": [
|
||||
{
|
||||
"name": "cpu_load_short",
|
||||
"columns": [
|
||||
"time",
|
||||
"value"
|
||||
],
|
||||
"values": [
|
||||
[
|
||||
"2015-01-29T21:55:43.702900257Z",
|
||||
2
|
||||
],
|
||||
[
|
||||
"2015-01-29T21:55:43.702900257Z",
|
||||
0.55
|
||||
],
|
||||
[
|
||||
"2015-06-11T20:46:02Z",
|
||||
0.64
|
||||
]
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
> **Note:** Appending `pretty=true` to the URL enables pretty-printed JSON output.
|
||||
While this is useful for debugging or when querying directly with tools like `curl`, it is not recommended for production use as it consumes unnecessary network bandwidth.
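
As mentioned earlier, the same query can also be sent as a `POST` request with the parameters form-encoded in the request body (a minimal sketch):

```bash
curl -XPOST 'http://localhost:8086/query' \
  --data-urlencode "db=mydb" \
  --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```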
|
||||
|
||||
### InfluxQL
|
||||
|
||||
Check out the [Data Exploration page](/enterprise_influxdb/v1.9/query_language/explore-data/) to get acquainted with InfluxQL.
|
||||
For more information about querying data with the InfluxDB API using InfluxQL, see the [API reference documentation](/enterprise_influxdb/v1.9/tools/api/#influxdb-1-x-http-endpoints).
|
|
@ -0,0 +1,419 @@
|
|||
---
|
||||
title: Rebalance InfluxDB Enterprise clusters
|
||||
description: Manually rebalance an InfluxDB Enterprise cluster.
|
||||
aliases:
|
||||
- /enterprise/v1.8/guides/rebalance/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Rebalance clusters
|
||||
weight: 19
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
This guide describes how to manually rebalance an InfluxDB Enterprise cluster.
|
||||
Rebalancing a cluster involves two primary goals:
|
||||
|
||||
* Evenly distribute
|
||||
[shards](/enterprise_influxdb/v1.9/concepts/glossary/#shard) across all data nodes in the
|
||||
cluster
|
||||
* Ensure that every
|
||||
shard is on *n* nodes, where *n* is determined by the retention policy's
|
||||
[replication factor](/enterprise_influxdb/v1.9/concepts/glossary/#replication-factor)
|
||||
|
||||
Rebalancing a cluster is essential for cluster health.
|
||||
Perform a rebalance if you add a new data node to your cluster.
|
||||
The proper rebalance path depends on the purpose of the new data node.
|
||||
If you added a data node to expand the disk size of the cluster or increase
|
||||
write throughput, follow the steps in
|
||||
[Rebalance Procedure 1](#rebalance-procedure-1-rebalance-a-cluster-to-create-space).
|
||||
If you added a data node to increase data availability for queries and query
|
||||
throughput, follow the steps in
|
||||
[Rebalance Procedure 2](#rebalance-procedure-2-rebalance-a-cluster-to-increase-availability).
|
||||
|
||||
### Requirements
|
||||
|
||||
The following sections assume that you already added a new data node to the
|
||||
cluster, and they use the
|
||||
[`influxd-ctl` tool](/enterprise_influxdb/v1.9/tools/influxd-ctl/) available on
|
||||
all meta nodes.
|
||||
|
||||
{{% warn %}}
|
||||
Before you begin, stop writing historical data to InfluxDB.
|
||||
Historical data have timestamps that occur at any time in the past.
|
||||
Performing a rebalance while writing historical data can lead to data loss.
|
||||
{{% /warn %}}
|
||||
|
||||
## Rebalance Procedure 1: Rebalance a cluster to create space
|
||||
|
||||
For demonstration purposes, the next steps assume that you added a third
|
||||
data node to a previously two-data-node cluster that has a
|
||||
[replication factor](/enterprise_influxdb/v1.9/concepts/glossary/#replication-factor) of
|
||||
two.
|
||||
This rebalance procedure is applicable for different cluster sizes and
|
||||
replication factors, but some of the specific, user-provided values will depend
|
||||
on that cluster size.
|
||||
|
||||
Rebalance Procedure 1 focuses on how to rebalance a cluster after adding a
|
||||
data node to expand the total disk capacity of the cluster.
|
||||
In the next steps, you will safely move shards from one of the two original data
|
||||
nodes to the new data node.
|
||||
|
||||
### Step 1: Truncate Hot Shards
|
||||
|
||||
Hot shards are shards that are currently receiving writes.
|
||||
Performing any action on a hot shard can lead to data inconsistency within the
|
||||
cluster which requires manual intervention from the user.
|
||||
|
||||
To prevent data inconsistency, truncate hot shards before moving any shards
|
||||
across data nodes.
|
||||
The command below creates a new hot shard which is automatically distributed
|
||||
across all data nodes in the cluster, and the system writes all new points to
|
||||
that shard.
|
||||
All previous writes are now stored in cold shards.
|
||||
|
||||
```
|
||||
influxd-ctl truncate-shards
|
||||
```
|
||||
|
||||
The expected output of this command is:
|
||||
|
||||
```
|
||||
Truncated shards.
|
||||
```
|
||||
|
||||
Once you truncate the shards, you can work on redistributing the cold shards
|
||||
without the threat of data inconsistency in the cluster.
|
||||
Any hot or new shards are now evenly distributed across the cluster and require
|
||||
no further intervention.
|
||||
|
||||
### Step 2: Identify Cold Shards
|
||||
|
||||
In this step, you identify the cold shards that you will copy to the new data node
|
||||
and remove from one of the original two data nodes.
|
||||
|
||||
The following command lists every shard in our cluster:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output is similar to the items in the codeblock below:
|
||||
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
21 telegraf autogen 2 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
24 telegraf autogen 2 [...] 2017-01-26T19:00:00Z [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
The sample output includes three shards.
|
||||
The first two shards are cold shards.
|
||||
The timestamp in the `End` column occurs in the past (assume that the current
|
||||
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
|
||||
are the two original data nodes: `enterprise-data-01:8088` and
|
||||
`enterprise-data-02:8088`.
|
||||
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
|
||||
on the timestamp in the `End` column.
|
||||
|
||||
The third shard is the newly-created hot shard; the timestamp in the `End`
|
||||
column is in the future (again, assume that the current time is just after
|
||||
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include one of the
|
||||
original data nodes (`enterprise-data-02:8088`) and the new data node
|
||||
(`enterprise-data-03:8088`).
|
||||
That hot shard and any subsequent shards require no attention during
|
||||
the rebalance process.
|
||||
|
||||
Identify the cold shards that you'd like to move from one of the original two
|
||||
data nodes to the new data node.
|
||||
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
|
||||
one of its owners in the `Owners` column (for example:
|
||||
`enterprise-data-01:8088`).
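
If the cluster has many shards, you can narrow the listing to a single owner by filtering the `influxd-ctl show-shards` output. For example, using the sample node names above:

```bash
# List only the shards owned by the first original data node
influxd-ctl show-shards | grep 'enterprise-data-01:8088'
```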
|
||||
|
||||
> **Note:**
|
||||
>
|
||||
> Use the following command string to determine the size of the shards in
|
||||
> your cluster:
|
||||
>
|
||||
> `find /var/lib/influxdb/data/ -mindepth 3 -type d -exec du -h {} \;`
|
||||
>
|
||||
> In general, we recommend moving larger shards to the new data node to increase the
|
||||
> available disk space on the original data nodes.
|
||||
> Note that moving shards increases network traffic.
|
||||
|
||||
### Step 3: Copy Cold Shards
|
||||
|
||||
Next, copy the relevant cold shards to the new data node with the syntax below.
|
||||
Repeat this command for every cold shard that you'd like to move to the
|
||||
new data node.
|
||||
|
||||
```
|
||||
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the address that you noted in step 2,
|
||||
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
|
||||
is the ID of the shard that you noted in step 2.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
|
||||
```
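
For example, using the sample values noted in step 2 (shard `22`, owned by `enterprise-data-01:8088`) and the new data node `enterprise-data-03:8088`, the command would be:

```bash
# Copy cold shard 22 from the original data node to the new data node
influxd-ctl copy-shard enterprise-data-01:8088 enterprise-data-03:8088 22
```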
|
||||
|
||||
### Step 4: Confirm the Copied Shards
|
||||
|
||||
Confirm that the TCP address of the new data node appears in the `Owners` column
|
||||
for every copied shard:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has three owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
In addition, verify that the copied shards appear in the new data node's shard
|
||||
directory and match the shards in the source data node's shard directory.
|
||||
Shards are located in
|
||||
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
|
||||
|
||||
Here's an example of the correct output for shard `22`:
|
||||
```
|
||||
# On the source data node (enterprise-data-01)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
|
||||
# On the new data node (enterprise-data-03)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
```
|
||||
|
||||
It is essential that every copied shard appears on the new data node both
|
||||
in the `influxd-ctl show-shards` output and in the shard directory.
|
||||
If a shard does not pass both of the tests above, please repeat step 3.
|
||||
|
||||
### Step 5: Remove Unnecessary Cold Shards
|
||||
|
||||
Next, remove the copied shard from the original data node with the command below.
|
||||
Repeat this command for every cold shard that you'd like to remove from one of
|
||||
the original data nodes.
|
||||
**Removing a shard is an irrecoverable, destructive action; please be
|
||||
cautious with this command.**
|
||||
|
||||
```
|
||||
influxd-ctl remove-shard <source_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the TCP address of the original data node and
|
||||
`shard_ID` is the ID of the shard that you noted in step 2.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Removed shard <shard_ID> from <source_TCP_address>
|
||||
```
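
Continuing the example, removing the now-redundant copy of shard `22` from `enterprise-data-01:8088`:

```bash
# Remove cold shard 22 from the original data node
influxd-ctl remove-shard enterprise-data-01:8088 22
```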
|
||||
|
||||
### Step 6: Confirm the Rebalance
|
||||
|
||||
For every relevant shard, confirm that the TCP address of the new data node and
|
||||
only one of the original data nodes appears in the `Owners` column:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has only two owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 2 [...] 2017-01-26T18:05:36.418734949Z* [{5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
That's it.
|
||||
You've successfully rebalanced your cluster; you expanded the available disk
|
||||
size on the original data nodes and increased the cluster's write throughput.
|
||||
|
||||
## Rebalance Procedure 2: Rebalance a cluster to increase availability
|
||||
|
||||
For demonstration purposes, the next steps assume that you added a third
|
||||
data node to a previously two-data-node cluster that has a
|
||||
[replication factor](/enterprise_influxdb/v1.9/concepts/glossary/#replication-factor) of
|
||||
two.
|
||||
This rebalance procedure is applicable for different cluster sizes and
|
||||
replication factors, but some of the specific, user-provided values will depend
|
||||
on that cluster size.
|
||||
|
||||
Rebalance Procedure 2 focuses on how to rebalance a cluster to improve availability
|
||||
and query throughput.
|
||||
In the next steps, you will increase the retention policy's replication factor and
|
||||
safely copy shards from one of the two original data nodes to the new data node.
|
||||
|
||||
### Step 1: Update the Retention Policy
|
||||
|
||||
[Update](/enterprise_influxdb/v1.9/query_language/manage-database/#modify-retention-policies-with-alter-retention-policy)
|
||||
every retention policy to have a replication factor of three.
|
||||
This step ensures that the system automatically distributes all newly-created
|
||||
shards across the three data nodes in the cluster.
|
||||
|
||||
The following query increases the replication factor to three.
|
||||
Run the query on any data node for each retention policy and database.
|
||||
Here, we use InfluxDB's [CLI](/enterprise_influxdb/v1.9/tools/use-influx/) to execute the query:
|
||||
|
||||
```
|
||||
> ALTER RETENTION POLICY "<retention_policy_name>" ON "<database_name>" REPLICATION 3
|
||||
>
|
||||
```
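
For example, a non-interactive sketch that updates the sample `telegraf` database's `autogen` retention policy (adjust the database, retention policy, and any host or authentication flags to match your deployment):

```bash
# Run the ALTER RETENTION POLICY statement with the influx CLI
influx -execute 'ALTER RETENTION POLICY "autogen" ON "telegraf" REPLICATION 3'
```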
|
||||
|
||||
A successful `ALTER RETENTION POLICY` query returns no results.
|
||||
Use the
|
||||
[`SHOW RETENTION POLICIES` query](/enterprise_influxdb/v1.9/query_language/explore-schema/#show-retention-policies)
|
||||
to verify the new replication factor.
|
||||
|
||||
Example:
|
||||
```
|
||||
> SHOW RETENTION POLICIES ON "telegraf"
|
||||
|
||||
name duration shardGroupDuration replicaN default
|
||||
---- -------- ------------------ -------- -------
|
||||
autogen 0s 1h0m0s 3 #👍 true
|
||||
```
|
||||
|
||||
### Step 2: Truncate Hot Shards
|
||||
|
||||
Hot shards are shards that are currently receiving writes.
|
||||
Performing any action on a hot shard can lead to data inconsistency within the
|
||||
cluster which requires manual intervention from the user.
|
||||
|
||||
To prevent data inconsistency, truncate hot shards before copying any shards
|
||||
to the new data node.
|
||||
The command below creates a new hot shard which is automatically distributed
|
||||
across the three data nodes in the cluster, and the system writes all new points
|
||||
to that shard.
|
||||
All previous writes are now stored in cold shards.
|
||||
|
||||
```
|
||||
influxd-ctl truncate-shards
|
||||
```
|
||||
|
||||
The expected output of this command is:
|
||||
|
||||
```
|
||||
Truncated shards.
|
||||
```
|
||||
|
||||
Once you truncate the shards, you can work on distributing the cold shards
|
||||
without the threat of data inconsistency in the cluster.
|
||||
Any hot or new shards are now automatically distributed across the cluster and
|
||||
require no further intervention.
|
||||
|
||||
### Step 3: Identify Cold Shards
|
||||
|
||||
In this step, you identify the cold shards that you will copy to the new data node.
|
||||
|
||||
The following command lists every shard in your cluster:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output is similar to the code block below:
|
||||
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
21 telegraf autogen 3 [...] 2017-01-26T18:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088}]
|
||||
24 telegraf autogen 3 [...] 2017-01-26T19:00:00Z [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
The sample output includes three shards.
|
||||
The first two shards are cold shards.
|
||||
The timestamp in the `End` column occurs in the past (assume that the current
|
||||
time is just after `2017-01-26T18:05:36.418734949Z`), and the shards' owners
|
||||
are the two original data nodes: `enterprise-data-01:8088` and
|
||||
`enterprise-data-02:8088`.
|
||||
The second shard is the truncated shard; truncated shards have an asterisk (`*`)
|
||||
on the timestamp in the `End` column.
|
||||
|
||||
The third shard is the newly-created hot shard; the timestamp in the `End`
|
||||
column is in the future (again, assume that the current time is just after
|
||||
`2017-01-26T18:05:36.418734949Z`), and the shard's owners include all three
|
||||
data nodes: `enterprise-data-01:8088`, `enterprise-data-02:8088`, and
|
||||
`enterprise-data-03:8088`.
|
||||
That hot shard and any subsequent shards require no attention during
|
||||
the rebalance process.
|
||||
|
||||
Identify the cold shards that you'd like to copy from one of the original two
|
||||
data nodes to the new data node.
|
||||
Take note of the cold shard's `ID` (for example: `22`) and the TCP address of
|
||||
one of its owners in the `Owners` column (for example:
|
||||
`enterprise-data-01:8088`).
|
||||
|
||||
### Step 4: Copy Cold Shards
|
||||
|
||||
Next, copy the relevant cold shards to the new data node with the syntax below.
|
||||
Repeat this command for every cold shard that you'd like to move to the
|
||||
new data node.
|
||||
|
||||
```
|
||||
influxd-ctl copy-shard <source_TCP_address> <destination_TCP_address> <shard_ID>
|
||||
```
|
||||
|
||||
Where `source_TCP_address` is the address that you noted in step 3,
|
||||
`destination_TCP_address` is the TCP address of the new data node, and `shard_ID`
|
||||
is the ID of the shard that you noted in step 3.
|
||||
|
||||
The expected output of the command is:
|
||||
```
|
||||
Copied shard <shard_ID> from <source_TCP_address> to <destination_TCP_address>
|
||||
```
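
For example, using the sample values noted in step 3 (shard `22`, owned by `enterprise-data-01:8088`) and the new data node `enterprise-data-03:8088`:

```bash
# Copy cold shard 22 to the new data node; with the replication factor at three,
# no shards need to be removed afterward
influxd-ctl copy-shard enterprise-data-01:8088 enterprise-data-03:8088 22
```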
|
||||
|
||||
### Step 5: Confirm the Rebalance
|
||||
|
||||
Confirm that the TCP address of the new data node appears in the `Owners` column
|
||||
for every copied shard:
|
||||
|
||||
```
|
||||
influxd-ctl show-shards
|
||||
```
|
||||
|
||||
The expected output shows that the copied shard now has three owners:
|
||||
```
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas [...] End Owners
|
||||
22 telegraf autogen 3 [...] 2017-01-26T18:05:36.418734949Z* [{4 enterprise-data-01:8088} {5 enterprise-data-02:8088} {6 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
In addition, verify that the copied shards appear in the new data node's shard
|
||||
directory and match the shards in the source data node's shard directory.
|
||||
Shards are located in
|
||||
`/var/lib/influxdb/data/<database>/<retention_policy>/<shard_ID>`.
|
||||
|
||||
Here's an example of the correct output for shard `22`:
|
||||
```
|
||||
# On the source data node (enterprise-data-01)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
|
||||
# On the new data node (enterprise-data-03)
|
||||
|
||||
~# ls /var/lib/influxdb/data/telegraf/autogen/22
|
||||
000000001-000000001.tsm # 👍
|
||||
```
|
||||
|
||||
That's it.
|
||||
You've successfully rebalanced your cluster and increased data availability for
|
||||
queries and query throughput.
|
|
@ -0,0 +1,426 @@
|
|||
---
|
||||
title: Replace InfluxDB Enterprise cluster meta nodes and data nodes
|
||||
description: Replace meta and data nodes in an InfluxDB Enterprise cluster.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Replace cluster nodes
|
||||
weight: 10
|
||||
parent: Guides
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
Nodes in an InfluxDB Enterprise cluster may need to be replaced at some point due to hardware needs, hardware issues, or something else entirely.
|
||||
This guide outlines processes for replacing both meta nodes and data nodes in an InfluxDB Enterprise cluster.
|
||||
|
||||
## Concepts
|
||||
Meta nodes manage and monitor both the uptime of nodes in the cluster and the distribution of [shards](/enterprise_influxdb/v1.9/concepts/glossary/#shard) among nodes in the cluster.
|
||||
They track which data nodes own which shards; the
|
||||
[anti-entropy](/enterprise_influxdb/v1.9/administration/anti-entropy/) (AE) process depends on this information.
|
||||
|
||||
Data nodes hold raw time-series data and metadata. Data shards are both distributed and replicated across data nodes in the cluster. The AE process runs on data nodes and references the shard information stored in the meta nodes to ensure each data node has the shards it needs.
|
||||
|
||||
`influxd-ctl` is a CLI included in each meta node and is used to manage your InfluxDB Enterprise cluster.
|
||||
|
||||
## Scenarios
|
||||
|
||||
### Replace nodes in clusters with security enabled
|
||||
Many InfluxDB Enterprise clusters are configured with security enabled, forcing secure TLS encryption between all nodes in the cluster.
|
||||
Both `influxd-ctl` and `curl`, the command line tools used when replacing nodes, have options that facilitate the use of TLS.
|
||||
|
||||
#### `influxd-ctl -bind-tls`
|
||||
To manage your cluster over TLS, pass the `-bind-tls` flag with any `influxd-ctl` command.
|
||||
|
||||
> If using a self-signed certificate, pass the `-k` flag to skip certificate verification.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl -bind-tls [-k] <command>
|
||||
|
||||
# Example
|
||||
influxd-ctl -bind-tls remove-meta enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
#### `curl -k`
|
||||
|
||||
`curl` natively supports TLS/SSL connections, but if using a self-signed certificate, pass the `-k`/`--insecure` flag to allow for "insecure" SSL connections.
|
||||
|
||||
> Self-signed certificates are considered "insecure" due to their lack of a valid chain of authority. However, data is still encrypted when using self-signed certificates.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
curl [-k, --insecure] <url>
|
||||
|
||||
# Example
|
||||
curl -k https://localhost:8091/status
|
||||
```
|
||||
|
||||
### Replace meta nodes in a functional cluster
|
||||
|
||||
If all meta nodes in the cluster are fully functional, simply follow the steps for [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Replace an unresponsive meta node
|
||||
|
||||
If replacing a meta node that is either unreachable or unrecoverable, you need to forcefully remove it from the meta cluster. Instructions for forcefully removing meta nodes are provided in [step 2.2](#2-2-remove-the-non-leader-meta-node) of the [replacing meta nodes](#replace-meta-nodes-in-an-influxdb-enterprise-cluster) process.
|
||||
|
||||
### Replace responsive and unresponsive data nodes in a cluster
|
||||
|
||||
The process of replacing both responsive and unresponsive data nodes is the same. Simply follow the instructions for [replacing data nodes](#replace-data-nodes-in-an-influxdb-enterprise-cluster).
|
||||
|
||||
### Reconnect a data node with a failed disk
|
||||
|
||||
A disk drive failing is never a good thing, but it does happen, and when it does,
|
||||
all shards on that node are lost.
|
||||
|
||||
Often in this scenario, rather than replacing the entire host, you just need to replace the disk.
|
||||
Host information remains the same, but once started again, the `influxd` process doesn't know
|
||||
to communicate with the meta nodes, so the AE process can't start the shard-sync process.
|
||||
|
||||
To resolve this, log in to a meta node and use the [`influxd-ctl update-data`](/enterprise_influxdb/v1.9/tools/influxd-ctl/#update-data) command
|
||||
to [update the failed data node to itself](#2-replace-the-old-data-node-with-the-new-data-node).
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl update-data <data-node-tcp-bind-address> <data-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-01:8088
|
||||
```
|
||||
|
||||
This will connect the `influxd` process running on the newly replaced disk to the cluster.
|
||||
The AE process will detect the missing shards and begin to sync data from other
|
||||
shards in the same shard group.
|
||||
|
||||
|
||||
## Replace meta nodes in an InfluxDB Enterprise cluster
|
||||
|
||||
[Meta nodes](/enterprise_influxdb/v1.9/concepts/clustering/#meta-nodes) together form a [Raft](https://raft.github.io/) cluster in which nodes elect a leader through consensus vote.
|
||||
The leader oversees the management of the meta cluster, so it is important to replace non-leader nodes before the leader node.
|
||||
The process for replacing meta nodes is as follows:
|
||||
|
||||
1. [Identify the leader node](#1-identify-the-leader-node)
|
||||
2. [Replace all non-leader nodes](#2-replace-all-non-leader-nodes)
|
||||
2.1. [Provision a new meta node](#2-1-provision-a-new-meta-node)
|
||||
2.2. [Remove the non-leader meta node](#2-2-remove-the-non-leader-meta-node)
|
||||
2.3. [Add the new meta node](#2-3-add-the-new-meta-node)
|
||||
2.4. [Confirm the meta node was added](#2-4-confirm-the-meta-node-was-added)
|
||||
2.5. [Remove and replace all other non-leader meta nodes](#2-5-remove-and-replace-all-other-non-leader-meta-nodes)
|
||||
3. [Replace the leader node](#3-replace-the-leader-node)
|
||||
3.1. [Kill the meta process on the leader node](#3-1-kill-the-meta-process-on-the-leader-node)
|
||||
3.2. [Remove and replace the old leader node](#3-2-remove-and-replace-the-old-leader-node)
|
||||
|
||||
### 1. Identify the leader node
|
||||
|
||||
Log into any of your meta nodes and run the following:
|
||||
|
||||
```bash
|
||||
curl -s localhost:8091/status | jq
|
||||
```
|
||||
|
||||
> Piping the command into `jq` is optional, but does make the JSON output easier to read.
|
||||
|
||||
The output will include information about the current meta node, the leader of the meta cluster, and a list of "peers" in the meta cluster.
|
||||
|
||||
```json
|
||||
{
|
||||
"nodeType": "meta",
|
||||
"leader": "enterprise-meta-01:8089",
|
||||
"httpAddr": "enterprise-meta-01:8091",
|
||||
"raftAddr": "enterprise-meta-01:8089",
|
||||
"peers": [
|
||||
"enterprise-meta-01:8089",
|
||||
"enterprise-meta-02:8089",
|
||||
"enterprise-meta-03:8089"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Identify the `leader` of the cluster. When replacing nodes in a cluster, non-leader nodes should be replaced _before_ the leader node.
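
To pull just the leader's address out of the status response, you can filter the JSON with `jq`:

```bash
# Print only the current leader of the meta cluster
curl -s localhost:8091/status | jq -r '.leader'
```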
|
||||
|
||||
### 2. Replace all non-leader nodes
|
||||
|
||||
#### 2.1. Provision a new meta node
|
||||
|
||||
[Provision and start a new meta node](/enterprise_influxdb/v1.9/installation/meta_node_installation/), but **do not** add it to the cluster yet.
|
||||
For this guide, the new meta node's hostname will be `enterprise-meta-04`.
|
||||
|
||||
#### 2.2. Remove the non-leader meta node
|
||||
|
||||
Now remove the non-leader node you are replacing by using the `influxd-ctl remove-meta` command and the TCP address of the meta node (ex. `enterprise-meta-02:8091`):
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl remove-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl remove-meta enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
> Only use `remove-meta` if you want to permanently remove a meta node from a cluster.
|
||||
|
||||
<!-- -->
|
||||
|
||||
> **For unresponsive or unrecoverable meta nodes:**
|
||||
|
||||
> If the meta process is not running on the node you are trying to remove or the node is neither reachable nor recoverable, use the `-force` flag.
|
||||
> When forcefully removing a meta node, you must also pass the `-tcpAddr` flag with the TCP and HTTP bind addresses of the node you are removing.
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl remove-meta -force -tcpAddr <meta-node-tcp-bind-address> <meta-node-http-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl remove-meta -force -tcpAddr enterprise-meta-02:8089 enterprise-meta-02:8091
|
||||
```
|
||||
|
||||
#### 2.3. Add the new meta node
|
||||
|
||||
Once the non-leader meta node has been removed, use `influxd-ctl add-meta` to replace it with the new meta node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl add-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl add-meta enterprise-meta-04:8091
|
||||
```
|
||||
|
||||
You can also add a meta node remotely through another meta node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl -bind <remote-meta-node-bind-address> add-meta <meta-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl -bind enterprise-meta-01:8091 add-meta enterprise-meta-04:8091
|
||||
```
|
||||
|
||||
> This command contacts the meta node running at `enterprise-meta-01:8091` and adds a meta node to that meta node’s cluster.
|
||||
> The added meta node has the hostname `enterprise-meta-04` and runs on port `8091`.
|
||||
|
||||
#### 2.4. Confirm the meta node was added
|
||||
|
||||
Confirm the new meta node has been added by running:
|
||||
|
||||
```bash
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
The new meta node should appear in the output:
|
||||
|
||||
```bash
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-01:8088 1.8.x-c1.8.x
|
||||
5 enterprise-data-02:8088 1.8.x-c1.8.x
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 1.8.x-c1.8.x
|
||||
enterprise-meta-03:8091 1.8.x-c1.8.x
|
||||
enterprise-meta-04:8091 1.8.x-c1.8.x # <-- The newly added meta node
|
||||
```
|
||||
|
||||
#### 2.5. Remove and replace all other non-leader meta nodes
|
||||
|
||||
**If replacing only one meta node, no further action is required.**
|
||||
If replacing others, repeat steps [2.1-2.4](#2-1-provision-a-new-meta-node) for all non-leader meta nodes one at a time.
|
||||
|
||||
### 3. Replace the leader node
|
||||
|
||||
As non-leader meta nodes are removed and replaced, the leader node oversees the replication of data to each of the new meta nodes.
|
||||
Leave the leader up and running until at least two of the new meta nodes are up, running and healthy.
|
||||
|
||||
#### 3.1 - Kill the meta process on the leader node
|
||||
|
||||
Log into the leader meta node and kill the meta process.
|
||||
|
||||
```bash
|
||||
# List the running processes and get the
|
||||
# PID of the 'influx-meta' process
|
||||
ps aux
|
||||
|
||||
# Kill the 'influx-meta' process
|
||||
kill <PID>
|
||||
```
|
||||
|
||||
Once killed, the meta cluster will elect a new leader using the [raft consensus algorithm](https://raft.github.io/).
|
||||
Confirm the new leader by running:
|
||||
|
||||
```bash
|
||||
curl localhost:8091/status | jq
|
||||
```
|
||||
|
||||
#### 3.2 - Remove and replace the old leader node
|
||||
|
||||
Remove the old leader node and replace it by following steps [2.1-2.4](#2-1-provision-a-new-meta-node).
|
||||
The minimum number of meta nodes you should have in your cluster is 3.
|
||||
|
||||
## Replace data nodes in an InfluxDB Enterprise cluster
|
||||
|
||||
[Data nodes](/enterprise_influxdb/v1.9/concepts/clustering/#data-nodes) house all raw time series data and metadata.
|
||||
The process of replacing data nodes is as follows:
|
||||
|
||||
1. [Provision a new data node](#1-provision-a-new-data-node)
|
||||
2. [Replace the old data node with the new data node](#2-replace-the-old-data-node-with-the-new-data-node)
|
||||
3. [Confirm the data node was added](#3-confirm-the-data-node-was-added)
|
||||
4. [Check the copy-shard-status](#4-check-the-copy-shard-status)
|
||||
|
||||
### 1. Provision a new data node
|
||||
|
||||
[Provision and start a new data node](/enterprise_influxdb/v1.9/installation/data_node_installation/), but **do not** add it to your cluster yet.
|
||||
|
||||
### 2. Replace the old data node with the new data node
|
||||
|
||||
Log into any of your cluster's meta nodes and use `influxd-ctl update-data` to replace the old data node with the new data node:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
influxd-ctl update-data <old-node-tcp-bind-address> <new-node-tcp-bind-address>
|
||||
|
||||
# Example
|
||||
influxd-ctl update-data enterprise-data-01:8088 enterprise-data-03:8088
|
||||
```
|
||||
|
||||
### 3. Confirm the data node was added
|
||||
|
||||
Confirm the new data node has been added by running:
|
||||
|
||||
```bash
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
The new data node should appear in the output:
|
||||
|
||||
```bash
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
4 enterprise-data-03:8088 1.8.x-c1.8.x # <-- The newly added data node
|
||||
5 enterprise-data-02:8088 1.8.x-c1.8.x
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 1.8.x-c1.8.x
|
||||
enterprise-meta-02:8091 1.8.x-c1.8.x
|
||||
enterprise-meta-03:8091 1.8.x-c1.8.x
|
||||
```
|
||||
|
||||
Inspect your cluster's shard distribution with `influxd-ctl show-shards`.
|
||||
Shards will immediately reflect the new address of the node.
|
||||
|
||||
```bash
|
||||
influxd-ctl show-shards
|
||||
|
||||
Shards
|
||||
==========
|
||||
ID Database Retention Policy Desired Replicas Shard Group Start End Expires Owners
|
||||
3 telegraf autogen 2 2 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
|
||||
1 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{5 enterprise-data-02:8088}]
|
||||
2 _internal monitor 2 1 2018-03-22T00:00:00Z 2018-03-23T00:00:00Z 2018-03-30T00:00:00Z [{4 enterprise-data-03:8088}]
|
||||
4 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{5 enterprise-data-02:8088}]
|
||||
5 _internal monitor 2 3 2018-03-23T00:00:00Z 2018-03-24T00:00:00Z 2018-03-01T00:00:00Z [{4 enterprise-data-03:8088}]
|
||||
6 foo autogen 2 4 2018-03-19T00:00:00Z 2018-03-26T00:00:00Z [{5 enterprise-data-02:8088} {4 enterprise-data-03:8088}]
|
||||
```
|
||||
|
||||
Within the duration defined by [`anti-entropy.check-interval`](/enterprise_influxdb/v1.9/administration/config-data-nodes#check-interval-10m),
|
||||
the AE service will begin copying shards from other shard owners to the new node.
|
||||
The time it takes for copying to complete is determined by the number of shards copied and how much data is stored in each.
|
||||
|
||||
### 4. Check the `copy-shard-status`
|
||||
|
||||
Check on the status of the copy-shard process with:
|
||||
|
||||
```bash
|
||||
influxd-ctl copy-shard-status
|
||||
```
|
||||
|
||||
The output will show all currently running copy-shard processes.
|
||||
|
||||
```bash
|
||||
Source Dest Database Policy ShardID TotalSize CurrentSize StartedAt
|
||||
enterprise-data-02:8088 enterprise-data-03:8088 telegraf autogen 3 119624324 119624324 2018-04-17 23:45:09.470696179 +0000 UTC
|
||||
```
|
||||
|
||||
> **Important:** If replacing other data nodes in the cluster, make sure shards are completely copied from nodes in the same shard group before replacing the other nodes.
|
||||
> View the [Anti-entropy](/enterprise_influxdb/v1.9/administration/anti-entropy/#concepts) documentation for important information regarding anti-entropy and your database's replication factor.
|
||||
|
||||
## Troubleshoot
|
||||
|
||||
### Cluster commands result in timeout without error
|
||||
|
||||
In some cases, commands used to add or remove nodes from your cluster
|
||||
time out, but don't return an error.
|
||||
|
||||
```bash
|
||||
add-data: operation timed out with error:
|
||||
```
|
||||
|
||||
#### Check your InfluxDB user permissions
|
||||
|
||||
In order to add or remove nodes to or from a cluster, your user must have `AddRemoveNode` permissions.
|
||||
Attempting to manage cluster nodes without the appropriate permissions results
|
||||
in a timeout with no accompanying error.
|
||||
|
||||
To check user permissions, log in to one of your meta nodes and `curl` the `/user` API endpoint:
|
||||
|
||||
```bash
|
||||
curl localhost:8091/user
|
||||
```
|
||||
|
||||
You can also check the permissions of a specific user by passing the username with the `name` parameter:
|
||||
|
||||
```bash
|
||||
# Pattern
|
||||
curl localhost:8091/user?name=<username>
|
||||
|
||||
# Example
|
||||
curl localhost:8091/user?name=bob
|
||||
```
|
||||
|
||||
The JSON output will include user information and permissions:
|
||||
|
||||
```json
|
||||
"users": [
|
||||
{
|
||||
"name": "bob",
|
||||
"hash": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
|
||||
"permissions": {
|
||||
"": [
|
||||
"ViewAdmin",
|
||||
"ViewChronograf",
|
||||
"CreateDatabase",
|
||||
"CreateUserAndRole",
|
||||
"DropDatabase",
|
||||
"DropData",
|
||||
"ReadData",
|
||||
"WriteData",
|
||||
"ManageShard",
|
||||
"ManageContinuousQuery",
|
||||
"ManageQuery",
|
||||
"ManageSubscription",
|
||||
"Monitor"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
_In the output above, `bob` does not have the required `AddRemoveNode` permissions
|
||||
and would not be able to add or remove nodes from the cluster._
|
||||
|
||||
#### Check the network connection between nodes
|
||||
|
||||
Something may be interrupting the network connection between nodes.
|
||||
To check, `ping` the server or node you're trying to add or remove.
|
||||
If the ping is unsuccessful, something in the network is preventing communication.
|
||||
|
||||
```bash
|
||||
# Check that the host is reachable
ping enterprise-data-03
|
||||
```
|
||||
|
||||
_If pings are unsuccessful, be sure to ping from other meta nodes as well to determine
|
||||
if the communication issues are unique to specific nodes._
|
|
@ -0,0 +1,190 @@
|
|||
---
|
||||
title: Write data with the InfluxDB API
|
||||
description: >
|
||||
Use the command line interface (CLI) to write data into InfluxDB with the API.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 10
|
||||
parent: Guides
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/guides/writing_data/
|
||||
v2: /influxdb/v2.0/write-data/
|
||||
---
|
||||
|
||||
Write data into InfluxDB using the [command line interface](/enterprise_influxdb/v1.9/tools/use-influx/), [client libraries](/enterprise_influxdb/v1.9/clients/api/), and plugins for common data formats such as [Graphite](/enterprise_influxdb/v1.9/write_protocols/graphite/).
|
||||
|
||||
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
|
||||
|
||||
### Create a database using the InfluxDB API
|
||||
|
||||
To create a database, send a `POST` request to the `/query` endpoint and set the URL parameter `q` to `CREATE DATABASE <new_database_name>`.
|
||||
The example below sends a request to InfluxDB running on `localhost` and creates the `mydb` database:
|
||||
|
||||
```bash
|
||||
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
|
||||
```
|
||||
|
||||
### Write data using the InfluxDB API
|
||||
|
||||
The InfluxDB API is the primary means of writing data into InfluxDB.
|
||||
|
||||
- To **write to a database using the InfluxDB 1.8 API**, send `POST` requests to the `/write` endpoint. For example, the following request writes a single point to the `mydb` database.
|
||||
The data consists of the [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement) `cpu_load_short`, the [tag keys](/enterprise_influxdb/v1.9/concepts/glossary/#tag-key) `host` and `region` with the [tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) `server01` and `us-west`, the [field key](/enterprise_influxdb/v1.9/concepts/glossary/#field-key) `value` with a [field value](/enterprise_influxdb/v1.9/concepts/glossary/#field-value) of `0.64`, and the [timestamp](/enterprise_influxdb/v1.9/concepts/glossary/#timestamp) `1434055562000000000`.
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
|
||||
--data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
|
||||
```
|
||||
|
||||
- To **write to a database using the InfluxDB 2.0 API (compatible with InfluxDB 1.8+)**, send `POST` requests to the [`/api/v2/write` endpoint](/enterprise_influxdb/v1.9/tools/api/#api-v2-write-http-endpoint):
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/api/v2/write?bucket=db/rp&precision=ns' \
|
||||
--header 'Authorization: Token username:password' \
|
||||
--data-raw 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
|
||||
```
|
||||
|
||||
When writing points, you must specify an existing database in the `db` query parameter.
|
||||
Points will be written to `db`'s default retention policy if you do not supply a retention policy via the `rp` query parameter.
|
||||
See the [InfluxDB API Reference](/enterprise_influxdb/v1.9/tools/api/#write-http-endpoint) documentation for a complete list of the available query parameters.
|
||||
|
||||
The body of the POST or [InfluxDB line protocol](/enterprise_influxdb/v1.9/concepts/glossary/#influxdb-line-protocol) contains the time series data that you want to store. Data includes:
|
||||
|
||||
- **Measurement (required)**
|
||||
- **Tags**: Strictly speaking, tags are optional but most series include tags to differentiate data sources and to make querying both easy and efficient.
|
||||
Both tag keys and tag values are strings.
|
||||
- **Fields (required)**: Field keys are required and are always strings, and, [by default](/enterprise_influxdb/v1.9/write_protocols/line_protocol_reference/#data-types), field values are floats.
|
||||
- **Timestamp**: Optional. Supplied at the end of the line as Unix time in nanoseconds since January 1, 1970 UTC. If you do not specify a timestamp, InfluxDB uses the server's local nanosecond timestamp in Unix epoch.
|
||||
Time in InfluxDB is in UTC format by default.
|
||||
|
||||
> **Note:** Avoid using the following reserved keys: `_field`, `_measurement`, and `time`. If reserved keys are included as a tag or field key, the associated point is discarded.
|
||||
|
||||
### Configure gzip compression
|
||||
|
||||
InfluxDB supports gzip compression. To reduce network traffic, consider the following options:
|
||||
|
||||
* To accept compressed data from InfluxDB, add the `Accept-Encoding: gzip` header to InfluxDB API requests.
|
||||
|
||||
* To compress data before sending it to InfluxDB, add the `Content-Encoding: gzip` header to InfluxDB API requests.
|
||||
|
||||
For details about enabling gzip for client libraries, see your client library documentation.
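
For example, a minimal sketch of a compressed write with `curl`, reusing the `mydb` database and sample point from the examples above (the file name is illustrative):

```bash
# Compress a line of line protocol, then send it with the Content-Encoding header
echo 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000' | gzip > point.gz

curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  --header 'Content-Encoding: gzip' \
  --data-binary @point.gz
```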
|
||||
|
||||
#### Enable gzip compression in the Telegraf InfluxDB output plugin
|
||||
|
||||
* In the Telegraf configuration file (`telegraf.conf`), under `[[outputs.influxdb]]`, change
|
||||
`content_encoding = "identity"` (default) to `content_encoding = "gzip"`
|
||||
|
||||
> **Note:**
|
||||
> Writes to InfluxDB 2.x (`[[outputs.influxdb_v2]]`) are configured to compress content in gzip format by default.
|
||||
|
||||
### Writing multiple points
|
||||
|
||||
Post multiple points to multiple series at the same time by separating each point with a new line.
|
||||
Batching points in this manner results in much higher performance.
|
||||
|
||||
The following example writes three points to the database `mydb`.
|
||||
The first point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02` and has the server's local timestamp.
|
||||
The second point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02,region=us-west` and has the specified timestamp `1422568543702900257`.
|
||||
The third point has the same specified timestamp as the second point, but it is written to the series with the measurement `cpu_load_short` and tag set `direction=in,host=server01,region=us-west`.
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
|
||||
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
|
||||
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257'
|
||||
```
|
||||
|
||||
### Writing points from a file
|
||||
|
||||
Write points from a file by passing `@filename` to `curl`.
|
||||
The data in the file should follow the [InfluxDB line protocol syntax](/enterprise_influxdb/v1.9/write_protocols/write_syntax/).
|
||||
|
||||
Example of a properly-formatted file (`cpu_data.txt`):
|
||||
|
||||
```txt
|
||||
cpu_load_short,host=server02 value=0.67
|
||||
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
|
||||
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257
|
||||
```
|
||||
|
||||
Write the data in `cpu_data.txt` to the `mydb` database with:
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt
|
||||
```
|
||||
|
||||
> **Note:** If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
|
||||
> By default, the HTTP request times out after five seconds.
|
||||
> InfluxDB will still attempt to write the points after that time out, but there will be no confirmation that they were successfully written.
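
One way to batch a large file with standard shell tools (a sketch; the chunk file prefix is arbitrary):

```bash
# Split the file into 5,000-line chunks, then write each chunk separately
split -l 5000 cpu_data.txt cpu_chunk_
for f in cpu_chunk_*; do
  curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @"$f"
done
```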
|
||||
|
||||
### Schemaless Design
|
||||
|
||||
InfluxDB is a schemaless database.
|
||||
You can add new measurements, tags, and fields at any time.
|
||||
Note that if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB will reject that data.
|
||||
|
||||
### A note on REST
|
||||
|
||||
InfluxDB uses HTTP solely as a convenient and widely supported data transfer protocol.
|
||||
|
||||
Modern web APIs have settled on REST because it addresses a common need.
|
||||
As the number of endpoints grows the need for an organizing system becomes pressing.
|
||||
REST is the industry agreed style for organizing large numbers of endpoints.
|
||||
This consistency is good for those developing and consuming the API: everyone involved knows what to expect.
|
||||
|
||||
REST, however, is a convention.
|
||||
InfluxDB makes do with three API endpoints.
|
||||
This simple, easy to understand system uses HTTP as a transfer method for [InfluxQL](/enterprise_influxdb/v1.9/query_language/spec/).
|
||||
The InfluxDB API makes no attempt to be RESTful.
|
||||
|
||||
### HTTP response summary
|
||||
|
||||
* 2xx: If your write request received `HTTP 204 No Content`, it was a success!
|
||||
* 4xx: InfluxDB could not understand the request.
|
||||
* 5xx: The system is overloaded or significantly impaired.
|
||||
|
||||
#### Examples
|
||||
|
||||
##### Writing a float to a field that previously accepted booleans
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=true'
|
||||
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=5'
|
||||
```
|
||||
|
||||
returns:
|
||||
|
||||
```bash
|
||||
HTTP/1.1 400 Bad Request
|
||||
Content-Type: application/json
|
||||
Request-Id: [...]
|
||||
X-Influxdb-Version: 1.4.x
|
||||
Date: Wed, 01 Mar 2017 19:38:01 GMT
|
||||
Content-Length: 150
|
||||
|
||||
{"error":"field type conflict: input field \"booleanonly\" on measurement \"tobeornottobe\" is type float, already exists as type boolean dropped=1"}
|
||||
```
|
||||
|
||||
##### Writing a point to a database that doesn't exist
|
||||
|
||||
```bash
|
||||
curl -i -XPOST 'http://localhost:8086/write?db=atlantis' --data-binary 'liters value=10'
|
||||
```
|
||||
|
||||
returns:
|
||||
|
||||
```bash
|
||||
HTTP/1.1 404 Not Found
|
||||
Content-Type: application/json
|
||||
Request-Id: [...]
|
||||
X-Influxdb-Version: 1.4.x
|
||||
Date: Wed, 01 Mar 2017 19:38:35 GMT
|
||||
Content-Length: 45
|
||||
|
||||
{"error":"database not found: \"atlantis\""}
|
||||
```
|
||||
|
||||
### Next steps
|
||||
|
||||
Now that you know how to write data with the InfluxDB API, discover how to query them with the [Querying data](/enterprise_influxdb/v1.9/guides/querying_data/) guide!
|
||||
For more information about writing data with the InfluxDB API, please see the [InfluxDB API reference](/enterprise_influxdb/v1.9/tools/api/#write-http-endpoint).
|
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
title: Introducing InfluxDB Enterprise
|
||||
description: Tasks required to get up and running with InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/introduction/
|
||||
weight: 2
|
||||
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Introduction
|
||||
|
||||
---
|
||||
|
||||
{{< children >}}
|
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
title: InfluxDB Enterprise downloads
|
||||
description: Download InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise/v1.8/introduction/download/
|
||||
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Downloads
|
||||
weight: 101
|
||||
parent: Introduction
|
||||
---
|
||||
|
||||
You must have a valid license to run InfluxDB Enterprise.
|
||||
You may obtain a 14-day demo license via the [InfluxDB Enterprise portal](https://portal.influxdata.com/users/new).
|
||||
|
||||
If you have purchased a license or already obtained a demo license,
|
||||
log in to the [InfluxDB Enterprise portal](https://portal.influxdata.com/users/sign_in)
|
||||
to get your license key and download URLs.
|
||||
|
||||
See the [installation documentation](/enterprise_influxdb/v1.9/install-and-deploy/)
|
||||
for more information about getting started.
|
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
title: Getting started with InfluxDB Enterprise
|
||||
description: Set up your cluster as a data source in Chronograf.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/introduction/getting_started/
|
||||
- /enterprise/v1.8/introduction/getting_started/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Getting started
|
||||
weight: 104
|
||||
parent: Introduction
|
||||
---
|
||||
|
||||
After you successfully [install and set up](/enterprise_influxdb/v1.9/install-and-deploy/installation/) InfluxDB Enterprise, learn how to [monitor your InfluxDB Enterprise clusters](/{{< latest "chronograf" >}}/guides/monitoring-influxenterprise-clusters) with Chronograf, InfluxDB, and Telegraf.
|
||||
|
||||
### Where to from here?
|
||||
|
||||
- [Review key concepts](/enterprise_influxdb/v1.9/concepts/) and check out the [Enterprise features](/enterprise_influxdb/v1.9/features/).
|
||||
- Learn how to [administer your cluster](/enterprise_influxdb/v1.9/administration/), including how to backup and restore clusters, configure clusters, log and trace clusters, manage security, and manage subscriptions.
|
||||
- Find [Enterprise guides](/enterprise_influxdb/v1.9/guides/) on a variety of topics, such as how to downsample and retain data, rebalance InfluxDB Enterprise clusters, use fine-grained authorization, and more!
|
||||
- Explore the [InfluxQL](/enterprise_influxdb/v1.9/query_language/) and [Flux](/enterprise_influxdb/v1.9/flux/) languages.
|
||||
- Learn about [InfluxDB line protocol](/enterprise_influxdb/v1.9/write_protocols/) and other [supported protocols](/enterprise_influxdb/v1.9/supported_protocols/).
|
|
@ -0,0 +1,29 @@
|
|||
---
|
||||
title: Install and deploy InfluxDB Enterprise
|
||||
description: Install InfluxDB Enterprise to on-premise or cloud providers, including Google Cloud Platform, Amazon Web Services, and Azure.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Install and deploy
|
||||
weight: 30
|
||||
parent: Introduction
|
||||
---
|
||||
|
||||
Install or deploy your InfluxDB Enterprise cluster in the environment of your choice:
|
||||
|
||||
- Your own environment
|
||||
- Your cloud provider
|
||||
|
||||
## Your own environment
|
||||
|
||||
Learn how to [install a cluster in your own environment](/enterprise_influxdb/v1.9/install-and-deploy/installation/).
|
||||
|
||||
## Your cloud provider
|
||||
|
||||
Learn how to deploy a cluster on the cloud provider of your choice:
|
||||
|
||||
- [GCP](/enterprise_influxdb/v1.9/install-and-deploy/deploying/google-cloud-platform/)
|
||||
- [AWS](/enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/)
|
||||
- [Azure](/enterprise_influxdb/v1.9/install-and-deploy/deploying/azure/)
|
|
@ -0,0 +1,20 @@
|
|||
---
|
||||
title: Deploy InfluxDB Enterprise clusters
|
||||
description: >
|
||||
Install InfluxDB Enterprise to a cloud provider of your choice, including Google Cloud Platform, Amazon Web Services, and Azure.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/other-options/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/index/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Deploy in cloud
|
||||
identifier: deploy-in-cloud-enterprise
|
||||
weight: 30
|
||||
parent: Install and deploy
|
||||
---
|
||||
|
||||
Deploy InfluxDB Enterprise clusters on the cloud provider of your choice.
|
||||
|
||||
> **Note:** To install in your own environment, see [Install an InfluxDB Enterprise cluster in your own environment](/enterprise_influxdb/v1.9/install-and-deploy/installation/).
|
||||
|
||||
{{< children hlevel="h2" >}}
|
|
@ -0,0 +1,19 @@
|
|||
---
|
||||
title: Deploy InfluxDB Enterprise clusters on Amazon Web Services
|
||||
description: Deploy InfluxDB Enterprise clusters on Amazon Web Services (AWS).
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/other-options/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/aws/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: AWS
|
||||
identifier: deploy-on-aws
|
||||
weight: 30
|
||||
parent: deploy-in-cloud-enterprise
|
||||
---
|
||||
|
||||
The following articles detail how to deploy InfluxDB Enterprise clusters on AWS:
|
||||
|
||||
- [Deploy an InfluxDB Enterprise cluster on Amazon Web Services](/enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/setting-up-template)
|
||||
- [AWS configuration options](/enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/config-options)
|
|
@ -0,0 +1,32 @@
|
|||
---
|
||||
title: AWS configuration options
|
||||
description: >
|
||||
Configuration options when deploying InfluxDB Enterprise on Amazon Web Services (AWS).
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/aws/config-options/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/config-options/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: AWS configuration options
|
||||
weight: 30
|
||||
parent: deploy-on-aws
|
||||
---
|
||||
When deploying InfluxDB Enterprise on AWS using the template described in [Deploy an InfluxDB Enterprise cluster on Amazon Web Services](/enterprise_influxdb/v1.9/install-and-deploy/aws/setting-up-template), the following configuration options are available:
|
||||
|
||||
- **VPC ID**: The VPC ID of your existing Virtual Private Cloud (VPC).
|
||||
- **Subnets**: A list of SubnetIds in your Virtual Private Cloud (VPC) where nodes will be created. The subnets must be in the same order as the availability zones they reside in. For a list of which availability zones correspond to which subnets, see the [Subnets section of your VPC dashboard](https://console.aws.amazon.com/vpc/home?region=us-east-1#subnets:sort=SubnetId).
|
||||
- **Availability Zones**: Availability zones to correspond with your subnets above. The availability zones must be in the same order as their related subnets. For a list of which availability zones correspond to which subnets, see the [Subnets section of your VPC dashboard](https://console.aws.amazon.com/vpc/home?region=us-east-1#subnets:sort=SubnetId).
|
||||
- **SSH Key Name**: An existing key pair to enable SSH access for the instances. For details on how to create a key pair, see [Creating a Key Pair Using Amazon EC2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair).
|
||||
- **InfluxDB ingress CIDR**: The IP address range that can be used to connect to the InfluxDB API endpoint. To allow all traffic, enter 0.0.0.0/0.
|
||||
- **SSH Access CIDR**: The IP address range that can be used to SSH into the EC2 instances. To allow all traffic, enter 0.0.0.0/0.
|
||||
- **InfluxDB Enterprise License Key**: Your InfluxDB Enterprise license key. Applies only to BYOL.
|
||||
- **InfluxDB Administrator Username**: Your InfluxDB administrator username. Applies only to BYOL.
|
||||
- **InfluxDB Administrator Password**: Your InfluxDB administrator password. Applies only to BYOL.
|
||||
- **InfluxDB Enterprise Version**: The version of InfluxDB. Defaults to current version. <!--Is this going to be taken out?-->
|
||||
- **Telegraf Version**: The version of Telegraf. Defaults to current version.
|
||||
- **InfluxDB Data Node Disk Size**: The size in GB of the EBS io1 volume on each data node. Defaults to 250.
|
||||
- **InfluxDB Data Node Disk IOPS**: The IOPS of the EBS io1 volume on each data node. Defaults to 1000.
|
||||
- **DataNodeInstanceType**: The instance type of the data node. Defaults to m5.large.
|
||||
- **MetaNodeInstanceType**: The instance type of the meta node. Defaults to t3.small.
|
||||
- **MonitorInstanceType**: The instance type of the monitor node. Defaults to t3.large.
|
|
@ -0,0 +1,58 @@
|
|||
---
|
||||
title: Deploy an InfluxDB Enterprise cluster on Amazon Web Services
|
||||
description: Deploy an InfluxDB Enterprise cluster on Amazon Web Services (AWS).
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/aws/setting-up-template/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/setting-up-template/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Deploy on Amazon Web Services
|
||||
weight: 20
|
||||
parent: deploy-on-aws
|
||||
---
|
||||
|
||||
Follow these steps to deploy an InfluxDB Enterprise cluster on AWS.
|
||||
|
||||
## Step 1: Specify template
|
||||
|
||||
After you complete the marketplace flow, you'll be directed to the Cloud Formation Template.
|
||||
|
||||
1. In the Prepare template section, select **Template is ready**.
|
||||
2. In the Specify template section, the **Amazon S3 URL** field is automatically populated with either the BYOL or integrated billing template, depending on the option you selected in the marketplace.
|
||||
3. Click **Next**.
|
||||
|
||||
## Step 2: Specify stack details
|
||||
|
||||
1. In the Stack name section, enter a name for your stack.
|
||||
2. Complete the Network Configuration section:
|
||||
- **VPC ID**: Click the dropdown menu to fill in your VPC.
|
||||
- **Subnets**: Select three subnets.
|
||||
- **Availability Zones**: Select three availability zones to correspond with your subnets above. The availability zones must be in the same order as their related subnets. For a list of which availability zones correspond to which subnets, see the [Subnets section of your VPC dashboard](https://console.aws.amazon.com/vpc/home?region=us-east-1#subnets:sort=SubnetId).
|
||||
- **SSH Key Name**: Select an existing key pair to enable SSH access for the instances. For details on how to create a key pair, see [Creating a Key Pair Using Amazon EC2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair).
|
||||
- **InfluxDB ingress CIDR**: Enter the IP address range that can be used to connect to the InfluxDB API endpoint. To allow all traffic, enter 0.0.0.0/0.
|
||||
- **SSH Access CIDR**: Enter the IP address range that can be used to SSH into the EC2 instances. To allow all traffic, enter 0.0.0.0/0.
|
||||
3. Complete the **InfluxDB Configuration** section:
|
||||
- **InfluxDB Enterprise License Key**: Applies only to BYOL. Enter your InfluxDB Enterprise license key.
|
||||
- **InfluxDB Administrator Username**: Applies only to BYOL. Enter your InfluxDB administrator username.
|
||||
- **InfluxDB Administrator Password**: Applies only to BYOL. Enter your InfluxDB administrator password.
|
||||
- **InfluxDB Enterprise Version**: Defaults to current version. <!--IS this going to be taken out?-->
|
||||
- **Telegraf Version**: Defaults to current version.
|
||||
- **InfluxDB Data Node Disk Size**: The size in GB of the EBS io1 volume on each data node. Defaults to 250.
|
||||
- **InfluxDB Data Node Disk IOPS**: The IOPS of the EBS io1 volume on each data node. Defaults to 1000.
|
||||
4. Review the **Other Parameters** section and modify if needed. The fields in this section are all automatically populated and shouldn't require changes.
|
||||
- **DataNodeInstanceType**: Defaults to m5.large.
|
||||
- **MetaNodeInstanceType**: Defaults to t3.small.
|
||||
- **MonitorInstanceType**: Defaults to t3.large.
|
||||
5. Click **Next**.
|
||||
|
||||
## Step 3: Configure stack options
|
||||
|
||||
1. In the **Tags** section, enter any key-value pairs you want to apply to resources in the stack.
|
||||
2. Review the **Permissions** and **Advanced options** sections. In most cases, there's no need to modify anything in these sections.
|
||||
3. Click **Next**.
|
||||
|
||||
## Step 4: Review
|
||||
|
||||
1. Review the configuration options for all of the above sections.
|
||||
2. In the **Capabilities** section, check the box acknowledging that AWS CloudFormation might create IAM resources.
|
||||
3. Click **Create stack**.
|
|
@ -0,0 +1,76 @@
|
|||
---
|
||||
title: Deploy an InfluxDB Enterprise cluster on Azure Cloud Platform
|
||||
description: >
|
||||
Deploy an InfluxDB Enterprise cluster on Microsoft Azure cloud computing service.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/azure/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/azure/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Azure
|
||||
weight: 20
|
||||
parent: deploy-in-cloud-enterprise
|
||||
---
|
||||
|
||||
For deploying InfluxDB Enterprise clusters on Microsoft Azure cloud computing service, InfluxData provides an [InfluxDB Enterprise application](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/influxdata.influxdb-enterprise-cluster) on the [Azure Marketplace](https://azuremarketplace.microsoft.com/) that makes the installation and setup process easy and straightforward. Clusters are deployed through an Azure Marketplace subscription and are ready for production. Billing occurs through your Azure subscription.
|
||||
|
||||
> Please submit issues and feature requests for the Azure Marketplace deployment [through the related GitHub repository](https://github.com/influxdata/azure-resource-manager-influxdb-enterprise/issues/new) (requires a GitHub account) or by contacting [InfluxData Support](mailto:support@influxdata.com).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
This guide requires the following:
|
||||
|
||||
- Microsoft Azure account with access to the [Azure Marketplace](https://azuremarketplace.microsoft.com/).
|
||||
- SSH access to cluster instances.
|
||||
|
||||
To deploy InfluxDB Enterprise clusters on platforms other than Azure, see [Deploy InfluxDB Enterprise](/enterprise_influxdb/v1.9/install-and-deploy/).
|
||||
|
||||
## Deploy a cluster
|
||||
|
||||
1. Log in to your Azure Cloud Platform account and navigate to [InfluxData's InfluxDB Enterprise (Official Version) application](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/influxdata.influxdb-enterprise-cluster) on Azure Marketplace.
|
||||
|
||||
2. Click **Get It Now**, read and agree to the terms of use, and then click **Continue**. Once in the Azure Portal, click **Create**.
|
||||
|
||||
3. Select the subscription to use for your InfluxDB Enterprise cluster. Then select a resource group and region where the cluster resources will be deployed.
|
||||
|
||||
> **Tip:** If you do not know which resource group to use, we recommend creating a new one for the InfluxDB Enterprise cluster.
|
||||
|
||||
4. In the Instance Details section, set the OS username and SSH authentication type you will use to access the cluster VMs. For password authentication, enter a username and password. For SSH public key authentication, copy an SSH public key. The cluster VMs are built from an Ubuntu base image.
|
||||
|
||||
5. Click **Next: Cluster Configuration**, and then enter details including the InfluxDB admin username and password, the number of meta and data nodes, and the VM size for both meta and data nodes. We recommend using the default VM sizes and increasing the data node VM size if you anticipate needing more resources for your workload.
|
||||
|
||||
> **Note:** Make sure to save the InfluxDB admin credentials. They will be required to access InfluxDB.
|
||||
|
||||
6. Click **Next: External Access & Chronograf**, and then do the following:
|
||||
|
||||
- To create a separate instance to monitor the cluster and run [Chronograf](https://www.influxdata.com/time-series-platform/chronograf/), select **Yes**. Otherwise, select **No**.
|
||||
|
||||
> **Note:** Adding a Chronograf instance will also configure that instance as an SSH bastion. All cluster instances will only be accessible through the Chronograf instance.
|
||||
|
||||
- Select the appropriate access for the InfluxDB load balancer: **External** to allow external Internet access; otherwise, select **Internal**.
|
||||
|
||||
{{% warn %}}The cluster uses HTTP by default. You must configure HTTPS after the cluster has been deployed.{{% /warn %}}
|
||||
|
||||
7. Click **Next: Review + create** to validate your cluster configuration details. If validation passes, your InfluxDB Enterprise cluster is deployed.
|
||||
|
||||
> **Note:** Some Azure accounts may have vCPU quotas limited to 10 vCPUs available in certain regions. Selecting VM sizes larger than the default can cause a validation error for exceeding the vCPU limit for the region.
|
||||
|
||||
## Access InfluxDB
|
||||
|
||||
Once the cluster is created, access the InfluxDB API at the IP address associated with the load balancer resource (`lb-influxdb`). If external access was configured during setup, the load balancer is publicly accessible. Otherwise, the load balancer is only accessible to the cluster's virtual network.
|
||||
|
||||
Use the load balancer IP address and the InfluxDB admin credentials entered during the cluster creation to interact with InfluxDB Enterprise via the [`influx` CLI](/enterprise_influxdb/v1.9/tools/use-influx/) or InfluxDB's [query](/enterprise_influxdb/v1.9/guides/query_data/) and [write](/enterprise_influxdb/v1.9/guides/write_data/) HTTP APIs.
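For example, assuming the load balancer listens on the default HTTP API port `8086`, and using `203.0.113.10` as a stand-in for its IP address (both are placeholders, not values from your deployment), a quick sanity check might look like this:

```bash
# Query with the influx CLI using the admin credentials from cluster creation
influx -host 203.0.113.10 -username admin -password '<admin_password>' -execute "SHOW DATABASES"

# Or write a point over the HTTP API (the target database must already exist)
curl -XPOST "http://203.0.113.10:8086/write?db=mydb&u=admin&p=<admin_password>" \
  --data-binary 'cpu,host=server01 value=0.64'
```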
|
||||
|
||||
## Access the cluster
|
||||
|
||||
The InfluxDB Enterprise cluster's VMs are only reachable within the virtual network using the SSH credentials provided during setup.
|
||||
|
||||
If a Chronograf instance has been added to the cluster, the Chronograf instance is publicly accessible via SSH. The other VMs can then be reached from the Chronograf VM.
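For example, with OpenSSH you can hop through the Chronograf VM in a single command. The usernames and addresses below are placeholders:

```bash
# Use the Chronograf VM as a jump host (ProxyJump requires OpenSSH 7.3+)
ssh -J <os_username>@<chronograf_public_ip> <os_username>@<data_node_private_ip>
```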
|
||||
|
||||
## Testing
|
||||
|
||||
Azure Resource Manager (ARM) templates used in the InfluxDB Enterprise offering on Azure Marketplace are [available for testing purposes](https://github.com/influxdata/azure-resource-manager-influxdb-enterprise). **Please note, these templates are under active development and not recommended for production.**
|
||||
|
||||
## Next steps
|
||||
|
||||
For an introduction to the InfluxDB database and the InfluxData Platform, see [Getting started with InfluxDB](/platform/introduction/getting-started).
|
|
|
|||
---
|
||||
title: Deploy an InfluxDB Enterprise cluster on Google Cloud Platform
|
||||
description: >
|
||||
Deploy an InfluxDB Enterprise cluster on Google Cloud Platform (GCP).
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/other-options/google-cloud/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/google-cloud-platform/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/deploying/google-cloud-platform/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: GCP
|
||||
weight: 30
|
||||
parent: deploy-in-cloud-enterprise
|
||||
---
|
||||
|
||||
Complete the following steps to deploy an InfluxDB Enterprise cluster on Google Cloud Platform (GCP):
|
||||
|
||||
1. [Verify prerequisites](#verify-prerequisites).
|
||||
2. [Deploy a cluster](#deploy-a-cluster).
|
||||
3. [Access the cluster](#access-the-cluster).
|
||||
|
||||
After deploying your cluster, see [Getting started with InfluxDB](/platform/introduction/getting-started) for an introduction to the InfluxDB database and the InfluxData platform.
|
||||
|
||||
>**Note:** InfluxDB Enterprise on GCP is a self-managed product. For a fully managed InfluxDB experience, check out [InfluxDB Cloud](/influxdb/cloud/get-started/).
|
||||
|
||||
## Verify prerequisites
|
||||
|
||||
Before deploying an InfluxDB Enterprise cluster on GCP, verify you have the following prerequisites:
|
||||
|
||||
- A [Google Cloud Platform (GCP)](https://cloud.google.com/) account with access to the [GCP Marketplace](https://cloud.google.com/marketplace/).
|
||||
- Access to [GCP Cloud Shell](https://cloud.google.com/shell/) or the [`gcloud` SDK and command line tools](https://cloud.google.com/sdk/).
|
||||
|
||||
## Deploy a cluster
|
||||
|
||||
1. Log in to your Google Cloud Platform account and go to [InfluxDB Enterprise](https://console.cloud.google.com/marketplace/details/influxdata-public/influxdb-enterprise-vm).
|
||||
|
||||
2. Click **Launch** to create or select a project to open up your cluster's configuration page.
|
||||
|
||||
3. Adjust cluster fields as needed, including:
|
||||
|
||||
- Deployment name: Enter a name for the InfluxDB Enterprise cluster.
|
||||
- InfluxDB Enterprise admin username: Enter the username of your cluster administrator.
|
||||
- Zone: Select a region for your cluster.
|
||||
- Network: Select a network for your cluster.
|
||||
- Subnetwork: Select a subnetwork for your cluster, if applicable.
|
||||
|
||||
> **Note:** The cluster is only accessible within the network (or subnetwork, if specified) where it's deployed.
|
||||
|
||||
4. Adjust data node fields as needed, including:
|
||||
|
||||
- Data node instance count: Enter the number of data nodes to include in your cluster (we recommend starting with the default, 2).
|
||||
- Data node machine type: Select the virtual machine type to use for data nodes (by default, 4 vCPUs). Use the down arrow to scroll through the list, and note the amount of memory available for each machine type. To alter the number of cores and memory for your selected machine type, click the **Customize** link. For details, see our recommended [hardware sizing guidelines](/enterprise_influxdb/v1.9/reference/hardware_sizing/).
|
||||
- (Optional) By default, the data node disk type is SSD Persistent Disk and the disk size is 250 GB. To alter these defaults, click **More** and update as needed.
|
||||
|
||||
> **Note:** Typically, fields in collapsed sections don't need to be altered.
|
||||
|
||||
5. Adjust meta node fields as needed, including:
|
||||
|
||||
- Meta node instance count: Enter the number of meta nodes to include in your cluster (we recommend using the default, 3).
|
||||
- Meta node machine type: Select the virtual machine type to use for meta nodes (by default, 1 vCPU). Use the down arrow to scroll through the list, and note the amount of memory available for each machine type. To alter the number of cores and memory for your selected machine type, click the **Customize** link.
|
||||
- By default, the meta node disk type is SSD Persistent Disk and the disk size is 10 GB. Alter these defaults if needed.
|
||||
|
||||
6. (Optional) Adjust boot disk option fields as needed. By default, the boot disk type is Standard Persistent Disk and the boot disk size is 10 GB.
|
||||
|
||||
7. Accept terms and conditions by selecting both check boxes, and then click **Deploy** to launch the InfluxDB Enterprise cluster.
|
||||
|
||||
The cluster may take a few minutes to fully deploy. If the deployment does not complete or reports an error, read through the list of [common deployment errors](https://cloud.google.com/marketplace/docs/troubleshooting).
|
||||
|
||||
> **Important:** Make sure you save the "Admin username", "Admin password", and "Connection internal IP" values displayed on the screen. They are required to access the cluster.
|
||||
|
||||
## Access the cluster
|
||||
|
||||
Access the cluster's IP address from the GCP network (or subnetwork) specified when you deployed the cluster. A cluster can only be reached from instances or services in the same GCP network or subnetwork.
|
||||
|
||||
1. In the GCP Cloud Shell or `gcloud` CLI, create a new instance to access the InfluxDB Enterprise cluster.
|
||||
|
||||
```
|
||||
gcloud compute instances create influxdb-access --image-family ubuntu-1804-lts --image-project ubuntu-os-cloud
|
||||
```
|
||||
|
||||
2. SSH into the instance.
|
||||
|
||||
```
|
||||
gcloud compute ssh influxdb-access
|
||||
```
|
||||
|
||||
3. On the instance, install the `influx` command line tool via the InfluxDB open source package.
|
||||
|
||||
```
|
||||
wget https://dl.influxdata.com/influxdb/releases/influxdb_{{< latest-patch >}}_amd64.deb
|
||||
sudo dpkg -i influxdb_{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
4. Access the InfluxDB Enterprise cluster using the following commands, substituting the "Admin username", "Admin password", and "Connection internal IP" values from the deployment screen for the corresponding `<value>` placeholders.
|
||||
|
||||
```
|
||||
influx -username <Admin username> -password <Admin password> -host <Connection internal IP> -execute "CREATE DATABASE test"
|
||||
|
||||
influx -username <Admin username> -password <Admin password> -host <Connection internal IP> -execute "SHOW DATABASES"
|
||||
```
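With the `test` database created above, you can also exercise the HTTP API directly from the access instance. This is only a sketch; substitute the same deployment values used in the previous step:

```bash
# Write a point to the test database over the HTTP API
curl -XPOST "http://<Connection internal IP>:8086/write?db=test&u=<Admin username>&p=<Admin password>" \
  --data-binary 'cpu,host=server01 value=0.64'

# Query the point back
curl -G "http://<Connection internal IP>:8086/query?db=test&u=<Admin username>&p=<Admin password>" \
  --data-urlencode "q=SELECT * FROM cpu"
```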
|
|
|
|||
---
|
||||
title: Install an InfluxDB Enterprise cluster in your own environment
|
||||
description: Install InfluxDB Enterprise in your own on-premise environment.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/installation/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/installation/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Install in your environment
|
||||
weight: 10
|
||||
parent: Install and deploy
|
||||
---
|
||||
|
||||
Complete the following steps to install an InfluxDB Enterprise cluster in your own environment:
|
||||
|
||||
1. [Install InfluxDB Enterprise meta nodes](/enterprise_influxdb/v1.9/install-and-deploy/installation/meta_node_installation/)
|
||||
2. [Install InfluxDB data nodes](/enterprise_influxdb/v1.9/install-and-deploy/installation/data_node_installation/)
|
||||
3. [Install Chronograf](/enterprise_influxdb/v1.9/install-and-deploy/installation/chrono_install/)
|
||||
|
||||
> **Note:** If you're looking for cloud infrastructure and services, check out how to deploy InfluxDB Enterprise (production-ready) on a cloud provider of your choice: [Azure](/enterprise_influxdb/v1.9/install-and-deploy/deploying/azure/), [GCP](/enterprise_influxdb/v1.9/install-and-deploy/deploying/google-cloud-platform/), or [AWS](/enterprise_influxdb/v1.9/install-and-deploy/deploying/aws/).
|
|
|
|||
---
|
||||
title: Install Chronograf
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/installation/chrono_install/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/installation/chrono_install/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Install Chronograf
|
||||
weight: 30
|
||||
parent: Install in your environment
|
||||
identifier: chrono_install
|
||||
---
|
||||
|
||||
Now that you've installed the meta nodes and data nodes, you are ready to install Chronograf
|
||||
to provide you with a user interface to access the InfluxDB Enterprise instance.
|
||||
|
||||
[Installation instructions for Chronograf](/{{< latest "chronograf" >}}/introduction/installation/)
|
|
|
|||
---
|
||||
title: Install InfluxDB Enterprise data nodes
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/installation/data_node_installation/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/installation/data_node_installation/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Install data nodes
|
||||
weight: 20
|
||||
parent: Install in your environment
|
||||
---
|
||||
|
||||
InfluxDB Enterprise offers highly scalable clusters on your infrastructure
|
||||
and a management UI for working with clusters.
|
||||
The next steps will get you up and running with the second essential component of
|
||||
your InfluxDB Enterprise cluster: the data nodes.
|
||||
|
||||
{{% warn %}}
|
||||
If you have not set up your meta nodes, please visit
|
||||
[Installing meta nodes](/enterprise_influxdb/v1.9/install-and-deploy/installation/meta_node_installation/).
|
||||
Bad things can happen if you complete the following steps without meta nodes.
|
||||
{{% /warn %}}
|
||||
|
||||
# Data node setup description and requirements
|
||||
|
||||
The installation process sets up two [data nodes](/enterprise_influxdb/v1.9/concepts/glossary#data-node)
|
||||
and each data node runs on its own server.
|
||||
You **must** have a minimum of two data nodes in a cluster.
|
||||
InfluxDB Enterprise clusters require at least two data nodes for high availability and redundancy.
|
||||
> **Note:** There is no requirement for each data node to run on its own
> server. However, best practice is to deploy each data node on a dedicated server.
|
||||
|
||||
See the
|
||||
[Clustering guide](/enterprise_influxdb/v1.9/concepts/clustering/#optimal-server-counts)
|
||||
for more on cluster architecture.
|
||||
|
||||
### Other requirements
|
||||
|
||||
#### License key or file
|
||||
|
||||
InfluxDB Enterprise requires a license key **OR** a license file to run.
|
||||
Your license key is available at [InfluxPortal](https://portal.influxdata.com/licenses).
|
||||
Contact support at the email we provided at signup to receive a license file.
|
||||
License files are required only if the nodes in your cluster cannot reach
|
||||
`portal.influxdata.com` on port `80` or `443`.
|
||||
|
||||
#### Networking
|
||||
|
||||
Data nodes communicate over ports `8088`, `8089`, and `8091`.
|
||||
|
||||
For licensing purposes, data nodes must also be able to reach `portal.influxdata.com`
|
||||
on port `80` or `443`.
|
||||
If the data nodes cannot reach `portal.influxdata.com` on port `80` or `443`,
|
||||
you'll need to set the `license-path` setting instead of the `license-key`
|
||||
setting in the data node configuration file.
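Before installing, you may want to confirm these requirements from each data node. The following is a minimal sketch using tools that may need to be installed separately (`nc`, `curl`); the hostname is one of the examples used later in this guide:

```bash
# Confirm the cluster ports are reachable on the other data node
for port in 8088 8089 8091; do
  nc -zv enterprise-data-02 "$port"
done

# Confirm the InfluxData portal is reachable for license validation
curl -sI https://portal.influxdata.com | head -n 1
```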
|
||||
|
||||
#### Load balancer
|
||||
|
||||
InfluxDB Enterprise does not function as a load balancer.
|
||||
You will need to configure your own load balancer to send client traffic to the
|
||||
data nodes on port `8086` (the default port for the [HTTP API](/enterprise_influxdb/v1.9/tools/api/)).
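Once a load balancer is in place, one way to confirm it routes to a healthy data node is the HTTP API's `/ping` endpoint. The load balancer address below is a placeholder:

```bash
# A healthy data node behind the load balancer responds with HTTP 204
curl -i http://<load_balancer_address>:8086/ping
```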
|
||||
|
||||
#### User account
|
||||
|
||||
The installation package creates an `influxdb` user that runs the InfluxDB data service. The `influxdb` user also owns certain files that are needed for the service to start successfully. In some cases, local policies may prevent the local user account from being created and the service fails to start. Contact your systems administrator for assistance with this requirement.
|
||||
|
||||
# Data node setup
|
||||
## Step 1: Add appropriate DNS entries for each of your servers
|
||||
|
||||
Ensure that your servers' hostnames and IP addresses are added to your network's DNS environment.
|
||||
The addition of DNS entries and IP assignment is usually site and policy specific; contact your DNS administrator for assistance as necessary.
|
||||
Ultimately, use entries similar to the following (hostnames and domain IP addresses are representative).
|
||||
|
||||
| Record Type | Hostname | IP |
|
||||
|:------------|:-------------------------------------:|------------------:|
|
||||
| A | ```enterprise-data-01.mydomain.com``` | ```<Data_1_IP>``` |
|
||||
| A | ```enterprise-data-02.mydomain.com``` | ```<Data_2_IP>``` |
|
||||
|
||||
> **Verification steps:**
|
||||
>
|
||||
Before proceeding with the installation, verify on each meta and data server that the other
|
||||
servers are resolvable. Here is an example set of shell commands using `ping`:
|
||||
>
|
||||
```
ping -qc 1 enterprise-meta-01
ping -qc 1 enterprise-meta-02
ping -qc 1 enterprise-meta-03
ping -qc 1 enterprise-data-01
ping -qc 1 enterprise-data-02
```
|
||||
|
||||
We highly recommend that each server be able to resolve the IP from the hostname alone as shown here.
|
||||
Resolve any connectivity issues before proceeding with the installation.
|
||||
A healthy cluster requires that every meta node and data node in a cluster be able to communicate.
|
||||
|
||||
## Step 2: Set up, configure, and start the data node services
|
||||
|
||||
Perform the following steps on each data node.
|
||||
|
||||
### I. Download and install the data service
|
||||
|
||||
#### Ubuntu and Debian (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
sudo dpkg -i influxdb-data_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
#### RedHat and CentOS (64-bit)
|
||||
|
||||
```bash
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
sudo yum localinstall influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
#### Verify the authenticity of release download (recommended)
|
||||
|
||||
For added security, follow these steps to verify the signature of your InfluxDB download with `gpg`.
|
||||
|
||||
1. Download and import InfluxData's public key:
|
||||
|
||||
```
|
||||
curl -s https://repos.influxdata.com/influxdb.key | gpg --import
|
||||
```
|
||||
|
||||
2. Download the signature file for the release by adding `.asc` to the download URL.
|
||||
For example:
|
||||
|
||||
```
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm.asc
|
||||
```
|
||||
|
||||
3. Verify the signature with `gpg --verify`:
|
||||
|
||||
```
|
||||
gpg --verify influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm.asc influxdb-data-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
The output from this command should include the following:
|
||||
|
||||
```
|
||||
gpg: Good signature from "InfluxDB Packaging Service <support@influxdb.com>" [unknown]
|
||||
```
|
||||
|
||||
### II. Edit the data node configuration files
|
||||
|
||||
First, in `/etc/influxdb/influxdb.conf`:
|
||||
|
||||
* Uncomment `hostname` at the top of the file and set it to the full hostname of the data node.
|
||||
* Uncomment `auth-enabled` in the `[http]` section and set it to `true`.
|
||||
* Uncomment `meta-auth-enabled` in the `[meta]` section and set it to `true`.
|
||||
* Uncomment `meta-internal-shared-secret` in the `[meta]` section and set it to a long pass phrase. The internal shared secret is used in JWT authentication for intra-node communication. This value must be the same for all of your data nodes and match the `[meta] internal-shared-secret` value in the configuration files of your meta nodes.
|
||||
|
||||
Second, in `/etc/influxdb/influxdb.conf`, set:
|
||||
|
||||
`license-key` in the `[enterprise]` section to the license key you received on InfluxPortal **OR** `license-path` in the `[enterprise]` section to the local path to the JSON license file you received from InfluxData.
|
||||
|
||||
{{% warn %}}
|
||||
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
```toml
|
||||
# Change this option to true to disable reporting.
|
||||
# reporting-disabled = false
|
||||
# bind-address = ":8088"
|
||||
hostname="<enterprise-data-0x>"
|
||||
|
||||
[enterprise]
|
||||
# license-key and license-path are mutually exclusive, use only one and leave the other blank
|
||||
license-key = "<your_license_key>" # Mutually exclusive with license-path
|
||||
|
||||
# The path to a valid license file. license-key and license-path are mutually exclusive,
|
||||
# use only one and leave the other blank.
|
||||
license-path = "/path/to/readable/JSON.license.file" # Mutually exclusive with license-key
|
||||
|
||||
[meta]
|
||||
# Where the cluster metadata is stored
|
||||
dir = "/var/lib/influxdb/meta" # data nodes do require a local meta directory
|
||||
...
|
||||
# This setting must have the same value as the meta nodes' meta.auth-enabled configuration.
|
||||
meta-auth-enabled = true
|
||||
|
||||
[...]
|
||||
|
||||
[http]
|
||||
# Determines whether HTTP endpoint is enabled.
|
||||
# enabled = true
|
||||
|
||||
# The bind address used by the HTTP service.
|
||||
# bind-address = ":8086"
|
||||
|
||||
# Determines whether HTTP authentication is enabled.
|
||||
auth-enabled = true # Recommended, but not required
|
||||
|
||||
[...]
|
||||
|
||||
# The JWT auth shared secret to validate requests using JSON web tokens.
|
||||
shared-secret = "long pass phrase used for signing tokens"
|
||||
```
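The shared secret can be any long pass phrase. As an illustration (not a requirement), you can generate a random value with `openssl` and use it for both the data nodes' `meta-internal-shared-secret` and the meta nodes' `internal-shared-secret`:

```bash
# Generate a random pass phrase to share across all node configuration files
openssl rand -base64 48
```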
|
||||
|
||||
### III. Start the data service
|
||||
|
||||
On sysvinit systems, enter:
|
||||
|
||||
```bash
|
||||
service influxdb start
|
||||
```
|
||||
|
||||
On systemd systems, enter:
|
||||
|
||||
```bash
|
||||
sudo systemctl start influxdb
|
||||
```
|
||||
|
||||
> **Verification steps:**
|
||||
>
|
||||
Check to see that the process is running by entering:
|
||||
>
|
||||
```bash
ps aux | grep -v grep | grep influxdb
```
|
||||
>
|
||||
You should see output similar to:
|
||||
>
|
||||
```
influxdb  2706  0.2  7.0 571008 35376 ?   Sl  15:37   0:16 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
```
|
||||
|
||||
|
||||
If you do not see the expected output, the process is either not launching or is exiting prematurely. Check the [logs](/enterprise_influxdb/v1.9/administration/logs/) for error messages and verify the previous setup steps are complete.
|
||||
|
||||
If you see the expected output, repeat for the remaining data nodes.
|
||||
Once all data nodes have been installed, configured, and launched, move on to the next section to join the data nodes to the cluster.
|
||||
|
||||
## Join the data nodes to the cluster
|
||||
|
||||
{{% warn %}}You should join your data nodes to the cluster only when you are adding a brand new node,
|
||||
either during the initial creation of your cluster or when growing the number of data nodes.
|
||||
If you are replacing an existing data node with `influxd-ctl update-data`, skip the rest of this guide.
|
||||
{{% /warn %}}
|
||||
|
||||
On one and only one of the meta nodes that you set up in the
|
||||
[previous document](/enterprise_influxdb/v1.9/install-and-deploy/installation/meta_node_installation/), run:
|
||||
|
||||
```bash
|
||||
influxd-ctl add-data enterprise-data-01:8088
|
||||
|
||||
influxd-ctl add-data enterprise-data-02:8088
|
||||
```
|
||||
|
||||
The expected output is:
|
||||
|
||||
```bash
|
||||
Added data node y at enterprise-data-0x:8088
|
||||
```
|
||||
|
||||
Run the `add-data` command once and only once for each data node you are joining
|
||||
to the cluster.
|
||||
|
||||
> **Verification steps:**
|
||||
>
|
||||
Issue the following command on any meta node:
|
||||
>
|
||||
```bash
influxd-ctl show
```
|
||||
>
|
||||
The expected output is:
|
||||
>
|
||||
```
Data Nodes
==========
ID   TCP Address               Version
4    enterprise-data-01:8088   {{< latest-patch >}}-c{{< latest-patch >}}
5    enterprise-data-02:8088   {{< latest-patch >}}-c{{< latest-patch >}}

Meta Nodes
==========
TCP Address               Version
enterprise-meta-01:8091   {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-02:8091   {{< latest-patch >}}-c{{< latest-patch >}}
enterprise-meta-03:8091   {{< latest-patch >}}-c{{< latest-patch >}}
```
|
||||
|
||||
|
||||
The output should include every data node that was added to the cluster.
|
||||
The first data node added should have `ID=N`, where `N` is equal to one plus the number of meta nodes.
|
||||
In a standard three-meta-node cluster, the first data node should have `ID=4`.
|
||||
Subsequently added data nodes should have monotonically increasing IDs.
|
||||
If not, there may be artifacts of a previous cluster in the metastore.
|
||||
|
||||
If you do not see your data nodes in the output, please retry adding them
|
||||
to the cluster.
|
||||
|
||||
Once your data nodes are part of your cluster move on to [the final step
|
||||
to set up Chronograf](/enterprise_influxdb/v1.9/install-and-deploy/installation/chrono_install).
|
|
|
|||
---
|
||||
title: Install InfluxDB Enterprise meta nodes
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/installation/meta_node_installation/
|
||||
- /enterprise_influxdb/v1.9/install-and-deploy/installation/meta_node_installation/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Install meta nodes
|
||||
weight: 10
|
||||
parent: Install in your environment
|
||||
---
|
||||
|
||||
InfluxDB Enterprise offers highly scalable clusters on your infrastructure
|
||||
and a management UI ([via Chronograf](/{{< latest "chronograf" >}})) for working with clusters.
|
||||
The installation process is designed for users looking to
|
||||
deploy InfluxDB Enterprise in a production environment.
|
||||
The following steps will get you up and running with the first essential component of
|
||||
your InfluxDB Enterprise cluster: the meta nodes.
|
||||
|
||||
|
||||
To install InfluxDB Enterprise meta nodes, do the following:
|
||||
|
||||
1. Review [meta node setup and requirements](#meta-node-setup-description-and-requirements)
|
||||
2. [Set up meta nodes](#set-up-meta-nodes):
|
||||
1. [Add DNS entries](#add-dns-entries)
|
||||
2. [Set up, configure, and start the meta services](#set-up-configure-and-start-the-meta-services)
|
||||
3. [Join meta nodes to the cluster](#join-meta-nodes-to-the-cluster)
|
||||
|
||||
## Meta node setup and requirements
|
||||
|
||||
The installation process sets up three [meta nodes](/enterprise_influxdb/v1.9/concepts/glossary/#meta-node), with each meta node running on its own server.
|
||||
|
||||
InfluxDB Enterprise clusters require an *odd number* of *at least three* meta nodes
|
||||
for high availability and redundancy.
|
||||
We typically recommend three meta nodes.
|
||||
If your servers have chronic communication or reliability issues, you can try adding meta nodes.
|
||||
|
||||
> **Note**: Deploying multiple meta nodes on the same server is strongly discouraged
|
||||
> since it creates a larger point of potential failure if that particular server is unresponsive.
|
||||
> InfluxData recommends deploying meta nodes on relatively small footprint servers.
|
||||
|
||||
See the
|
||||
[Clustering in InfluxDB Enterprise](/enterprise_influxdb/v1.9/concepts/clustering/)
|
||||
for more on cluster architecture.
|
||||
|
||||
### Other requirements
|
||||
|
||||
#### License key or file
|
||||
|
||||
InfluxDB Enterprise requires a license key *or* a license file to run.
|
||||
Your license key is available at [InfluxPortal](https://portal.influxdata.com/licenses).
|
||||
Contact support at the email we provided at signup to receive a license file.
|
||||
License files are required only if the nodes in your cluster cannot reach
|
||||
`portal.influxdata.com` on port `80` or `443`.
|
||||
|
||||
#### Ports
|
||||
|
||||
Meta nodes communicate over ports `8088`, `8089`, and `8091`.
|
||||
|
||||
For licensing purposes, meta nodes must also be able to reach `portal.influxdata.com`
|
||||
on port `80` or `443`.
|
||||
If the meta nodes cannot reach `portal.influxdata.com` on port `80` or `443`,
|
||||
you'll need to set the `license-path` setting instead of the `license-key`
|
||||
setting in the meta node configuration file.
|
||||
|
||||
#### User account
|
||||
|
||||
The installation package creates an `influxdb` user on the operating system.
|
||||
The `influxdb` user runs the InfluxDB meta service.
|
||||
The `influxdb` user also owns certain files needed to start the service.
|
||||
In some cases, local policies may prevent the local user account from being created and the service fails to start.
|
||||
Contact your systems administrator for assistance with this requirement.
|
||||
|
||||
## Set up meta nodes
|
||||
|
||||
1. [Add DNS entries](#add-dns-entries)
|
||||
2. [Set up, configure, and start the meta services](#set-up-configure-and-start-the-meta-services)
|
||||
3. [Join meta nodes to the cluster](#join-meta-nodes-to-the-cluster)
|
||||
|
||||
### Add DNS entries
|
||||
|
||||
Ensure that your servers' hostnames and IP addresses are added to your network's DNS environment.
|
||||
The addition of DNS entries and IP assignment is usually site and policy specific; contact your DNS administrator for assistance as necessary.
|
||||
Ultimately, use entries similar to the following (hostnames and domain IP addresses are representative).
|
||||
|
||||
| Record Type | Hostname | IP |
|
||||
|:------------|:---------------------------------:|--------------:|
|
||||
| `A` | `enterprise-meta-01.mydomain.com` | `<Meta_1_IP>` |
|
||||
| `A` | `enterprise-meta-02.mydomain.com` | `<Meta_2_IP>` |
|
||||
| `A` | `enterprise-meta-03.mydomain.com` | `<Meta_3_IP>` |
|
||||
|
||||
|
||||
#### Verify DNS resolution
|
||||
|
||||
Before proceeding with the installation, verify on each server that the other
|
||||
servers are resolvable. Here is an example set of shell commands using `ping`:
|
||||
|
||||
```
|
||||
ping -qc 1 enterprise-meta-01
|
||||
ping -qc 1 enterprise-meta-02
|
||||
ping -qc 1 enterprise-meta-03
|
||||
```
|
||||
|
||||
We highly recommend that each server be able to resolve the IP from the hostname alone as shown here.
|
||||
Resolve any connectivity issues before proceeding with the installation.
|
||||
A healthy cluster requires that every meta node can communicate with every other
|
||||
meta node.
|
||||
|
||||
### Set up, configure, and start the meta services
|
||||
|
||||
Perform the following steps on each meta server.
|
||||
|
||||
#### I. Download and install the meta service
|
||||
|
||||
##### Ubuntu & Debian (64-bit)
|
||||
|
||||
```
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
sudo dpkg -i influxdb-meta_{{< latest-patch >}}-c{{< latest-patch >}}_amd64.deb
|
||||
```
|
||||
|
||||
##### RedHat & CentOS (64-bit)
|
||||
|
||||
```
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
sudo yum localinstall influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
##### Verify the authenticity of release download (recommended)
|
||||
|
||||
For added security, follow these steps to verify the signature of your InfluxDB download with `gpg`.
|
||||
|
||||
1. Download and import InfluxData's public key:
|
||||
|
||||
```
|
||||
curl -s https://repos.influxdata.com/influxdb.key | gpg --import
|
||||
```
|
||||
|
||||
2. Download the signature file for the release by adding `.asc` to the download URL.
|
||||
For example:
|
||||
|
||||
```
|
||||
wget https://dl.influxdata.com/enterprise/releases/influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm.asc
|
||||
```
|
||||
|
||||
3. Verify the signature with `gpg --verify`:
|
||||
|
||||
```
|
||||
gpg --verify influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm.asc influxdb-meta-{{< latest-patch >}}_c{{< latest-patch >}}.x86_64.rpm
|
||||
```
|
||||
|
||||
The output from this command should include the following:
|
||||
|
||||
```
|
||||
gpg: Good signature from "InfluxDB Packaging Service <support@influxdb.com>" [unknown]
|
||||
```
|
||||
|
||||
#### II. Edit the configuration file
|
||||
|
||||
In `/etc/influxdb/influxdb-meta.conf`:
|
||||
|
||||
* Uncomment `hostname` and set to the full hostname of the meta node.
|
||||
* Uncomment `internal-shared-secret` in the `[meta]` section and set it to a long pass phrase to be used in JWT authentication for intra-node communication. This value must be the same for all of your meta nodes and match the `[meta] meta-internal-shared-secret` value in the configuration files of your data nodes.
|
||||
* Set `license-key` in the `[enterprise]` section to the license key you received on InfluxPortal **OR** `license-path` in the `[enterprise]` section to the local path to the JSON license file you received from InfluxData.
|
||||
|
||||
{{% warn %}}
|
||||
The `license-key` and `license-path` settings are mutually exclusive and one must remain set to the empty string.
|
||||
{{% /warn %}}
|
||||
|
||||
```
|
||||
# Hostname advertised by this host for remote addresses. This must be resolvable by all
|
||||
# other nodes in the cluster
|
||||
hostname="<enterprise-meta-0x>"
|
||||
|
||||
[enterprise]
|
||||
# license-key and license-path are mutually exclusive, use only one and leave the other blank
|
||||
license-key = "<your_license_key>" # Mutually exclusive with license-path
|
||||
|
||||
# license-key and license-path are mutually exclusive, use only one and leave the other blank
|
||||
license-path = "/path/to/readable/JSON.license.file" # Mutually exclusive with license-key
|
||||
```
|
||||
|
||||
#### III. Start the meta service
|
||||
|
||||
On sysvinit systems, enter:
|
||||
```
|
||||
service influxdb-meta start
|
||||
```
|
||||
|
||||
On systemd systems, enter:
|
||||
```
|
||||
sudo systemctl start influxdb-meta
|
||||
```
|
||||
|
||||
#### Verify meta node process
|
||||
Check to see that the process is running by entering:
|
||||
|
||||
```
|
||||
ps aux | grep -v grep | grep influxdb-meta
|
||||
```
|
||||
|
||||
You should see output similar to:
|
||||
|
||||
```
|
||||
influxdb 3207 0.8 4.4 483000 22168 ? Ssl 17:05 0:08 /usr/bin/influxd-meta -config /etc/influxdb/influxdb-meta.conf
|
||||
```
|
||||
|
||||
> **Note:** It is possible to start the cluster with a single meta node but you
|
||||
must pass the `-single-server` flag when starting the single meta node.
|
||||
Please note that a cluster with only one meta node is **not** recommended for
|
||||
production environments.
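For example, on a non-production test system, a standalone meta node might be started like this (a sketch that assumes the default configuration path from the package install):

```bash
influxd-meta -config /etc/influxdb/influxdb-meta.conf -single-server
```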
|
||||
|
||||
### Join meta nodes to the cluster
|
||||
|
||||
From one and only one meta node, join all meta nodes to the cluster, including the node you are running the commands from.
|
||||
In our example, from `enterprise-meta-01`, run:
|
||||
|
||||
```
|
||||
influxd-ctl add-meta enterprise-meta-01:8091
|
||||
influxd-ctl add-meta enterprise-meta-02:8091
|
||||
influxd-ctl add-meta enterprise-meta-03:8091
|
||||
```
|
||||
|
||||
> **Note:** Please make sure that you specify the fully qualified host name of
|
||||
the meta node during the join process.
|
||||
Do not specify `localhost`, as this can cause cluster connection issues.
|
||||
|
||||
The expected output is:
|
||||
```
|
||||
Added meta node x at enterprise-meta-0x:8091
|
||||
```
|
||||
|
||||
#### Verify cluster
|
||||
|
||||
To verify the cluster, run the following command on any meta node:
|
||||
|
||||
```
|
||||
influxd-ctl show
|
||||
```
|
||||
|
||||
The expected output is:
|
||||
|
||||
```
|
||||
Data Nodes
|
||||
==========
|
||||
ID TCP Address Version
|
||||
|
||||
Meta Nodes
|
||||
==========
|
||||
TCP Address Version
|
||||
enterprise-meta-01:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-02:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
enterprise-meta-03:8091 {{< latest-patch >}}-c{{< latest-patch >}}
|
||||
```
|
||||
|
||||
Note that your cluster must have at least three meta nodes.
|
||||
If you do not see your meta nodes in the output, retry adding them to
|
||||
the cluster.
|
||||
|
||||
Once your meta nodes are part of your cluster move on to [the next steps to
|
||||
set up your data nodes](/enterprise_influxdb/v1.9/install-and-deploy/installation/data_node_installation/).
|
||||
Please do not continue to the next steps if your meta nodes are not part of the
|
||||
cluster.
|
|
|
|||
---
|
||||
title: Installation requirements
|
||||
description: Requirements for installing and deploying InfluxDB Enterprise.
|
||||
aliases:
|
||||
- /enterprise_influxdb/v1.9/introduction/meta_node_installation/
|
||||
- /enterprise_influxdb/v1.9/introduction/data_node_installation/
|
||||
- /enterprise/v1.8/introduction/installation_guidelines/
|
||||
- /enterprise_influxdb/v1.9/introduction/installation_guidelines/
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 102
|
||||
parent: Introduction
|
||||
---
|
||||
|
||||
Review the installation requirements below, and then check out available options to [install and deploy InfluxDB Enterprise](/enterprise_influxdb/v1.9/install-and-deploy/). For an overview of the architecture and concepts in an InfluxDB Enterprise cluster, review [Clustering in InfluxDB Enterprise](/enterprise_influxdb/v1.9/concepts/clustering/).
|
||||
|
||||
## Requirements for InfluxDB Enterprise clusters
|
||||
|
||||
InfluxDB Enterprise clusters require a license. To use a license key, all nodes in the cluster must be able to contact https://portal.influxdata.com via port `80` or port `443`. If nodes in the cluster cannot communicate with https://portal.influxdata.com, you must use the `license-path` configuration setting. For more information, see [Enterprise license settings](/enterprise_influxdb/v1.9/administration/config-data-nodes/#enterprise-license-settings).
|
||||
|
||||
Nodes attempt to download a new license file for the given key every four hours. If a node cannot connect to the server and retrieve a new license file, the node uses the existing license file. After a license expires, nodes have the following grace periods:
|
||||
|
||||
- If [InfluxDB daemon (`influxd`)](/enterprise_influxdb/v1.9/tools/influxd#sidebar) starts and fails to validate the license, the node has a 4-hour grace period.
|
||||
- If `influxd` starts and validates the license, and then a later license check fails, the node has a 14-day grace period.
|
||||
|
||||
### Frequently overlooked requirements
|
||||
|
||||
The following are the most frequently overlooked requirements when installing a cluster.
|
||||
|
||||
#### Ensure connectivity between machines
|
||||
|
||||
All nodes in the cluster must be able to resolve each other by hostname or IP,
|
||||
whichever is used in the configuration files.
|
||||
|
||||
For simplicity, ensure that all nodes can reach all other nodes on ports `8086`, `8088`, `8089`, and `8091`.
|
||||
If you alter the default ports in the configuration file(s), ensure the configured ports are open between the nodes.
|
||||
|
||||
#### Synchronize time between hosts
|
||||
|
||||
InfluxDB Enterprise uses hosts' local time in UTC to assign timestamps to data and for coordination purposes.
|
||||
Use the Network Time Protocol (NTP) to synchronize time between hosts.
|
||||
|
||||
#### Use SSDs
|
||||
|
||||
Clusters require sustained availability of 1000-2000 IOPS from the attached storage.
|
||||
SANs must guarantee at least 1000 IOPS is always available to InfluxDB Enterprise
|
||||
nodes or they may not be sufficient.
|
||||
SSDs are strongly recommended, and we have had no reports of IOPS contention from any customers running on SSDs.
|
||||
|
||||
#### Use three and only three meta nodes
|
||||
|
||||
Although technically the cluster can function with any number of meta nodes, the best practice is to ALWAYS have an odd number of meta nodes.
|
||||
This allows the meta nodes to reach consensus.
|
||||
An even number of meta nodes cannot achieve consensus because there can be no "deciding vote" cast between the nodes if they disagree.
|
||||
|
||||
Therefore, the minimum number of meta nodes for a high availability (HA) installation is three. The typical HA installation for InfluxDB Enterprise deploys three meta nodes.
|
||||
|
||||
Aside from three being a magic number, a three meta node cluster can tolerate the permanent loss of a single meta node with no degradation in any function or performance.
|
||||
A replacement meta node can be added to restore the cluster to full redundancy.
|
||||
A three meta node cluster that loses two meta nodes will still be able to handle
|
||||
basic writes and queries, but no new shards, databases, users, etc. can be created.
|
||||
|
||||
Running a cluster with five meta nodes does allow for the permanent loss of
|
||||
two meta nodes without impact on the cluster, but it doubles the
|
||||
Raft communication overhead.
|
||||
|
||||
#### Meta and data nodes are fully independent
|
||||
|
||||
Meta nodes run the Raft consensus protocol together, and manage the metastore of
|
||||
all shared cluster information: cluster nodes, databases, retention policies,
|
||||
shard groups, users, continuous queries, and subscriptions.
|
||||
|
||||
Data nodes store the shard groups and respond to queries.
|
||||
They request metastore information from the meta group as needed.
|
||||
|
||||
There is no requirement at all for there to be a meta process on a data node,
|
||||
or for there to be a meta process per data node.
|
||||
Three meta nodes is enough for an arbitrary number of data nodes, and for best
|
||||
redundancy, all nodes should run on independent servers.
|
||||
|
||||
#### Install Chronograf last
|
||||
|
||||
Chronograf should not be installed or configured until the
|
||||
InfluxDB Enterprise cluster is fully functional.
|
||||
|
||||
#### Set up monitoring
|
||||
|
||||
Monitoring gives you visibility into the status and performance of your cluster.
|
||||
See ["Monitor the InfluxData Platform"](/platform/monitoring/influxdata-platform/) for information on setting up monitoring for your InfluxDB Enterprise installation.
|
|
|
|||
---
|
||||
title: Influx Query Language (InfluxQL)
|
||||
description: >
|
||||
Influx Query Language (InfluxQL) is InfluxDB's SQL-like query language.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
weight: 70
|
||||
identifier: InfluxQL
|
||||
---
|
||||
|
||||
This section introduces InfluxQL, the InfluxDB SQL-like query language for
|
||||
working with data in InfluxDB databases.
|
||||
|
||||
## InfluxQL tutorial
|
||||
The first seven documents in this section provide a tutorial-style introduction
|
||||
to InfluxQL.
|
||||
Feel free to download the dataset provided in
|
||||
[Sample Data](/enterprise_influxdb/v1.9/query_language/data_download/) and follow along
|
||||
with the documentation.
|
||||
|
||||
#### Data exploration
|
||||
|
||||
[Data exploration](/enterprise_influxdb/v1.9/query_language/explore-data/) covers the
|
||||
query language basics for InfluxQL, including the
|
||||
[`SELECT` statement](/enterprise_influxdb/v1.9/query_language/explore-data/#the-basic-select-statement),
|
||||
[`GROUP BY` clauses](/enterprise_influxdb/v1.9/query_language/explore-data/#the-group-by-clause),
|
||||
[`INTO` clauses](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause), and more.
|
||||
See Data Exploration to learn about
|
||||
[time syntax](/enterprise_influxdb/v1.9/query_language/explore-data/#time-syntax) and
|
||||
[regular expressions](/enterprise_influxdb/v1.9/query_language/explore-data/#regular-expressions) in
|
||||
queries.
|
||||
|
||||
#### Schema exploration
|
||||
|
||||
[Schema exploration](/enterprise_influxdb/v1.9/query_language/explore-schema/) covers
|
||||
queries that are useful for viewing and exploring your
|
||||
[schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema).
|
||||
See Schema Exploration for syntax explanations and examples of InfluxQL's `SHOW`
|
||||
queries.
|
||||
|
||||
#### Database management
|
||||
|
||||
[Database management](/enterprise_influxdb/v1.9/query_language/manage-database/) covers InfluxQL for managing
|
||||
[databases](/enterprise_influxdb/v1.9/concepts/glossary/#database) and
|
||||
[retention policies](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp) in
|
||||
InfluxDB.
|
||||
See Database Management for creating and dropping databases and retention
|
||||
policies as well as deleting and dropping data.
|
||||
|
||||
#### InfluxQL functions
|
||||
|
||||
Covers all [InfluxQL functions](/enterprise_influxdb/v1.9/query_language/functions/).
|
||||
|
||||
#### InfluxQL Continuous Queries
|
||||
|
||||
[InfluxQL Continuous Queries](/enterprise_influxdb/v1.9/query_language/continuous_queries/) covers the
|
||||
[basic syntax](/enterprise_influxdb/v1.9/query_language/continuous_queries/#basic-syntax),
[advanced syntax](/enterprise_influxdb/v1.9/query_language/continuous_queries/#advanced-syntax), and
[common use cases](/enterprise_influxdb/v1.9/query_language/continuous_queries/#continuous-query-use-cases) for
[Continuous Queries](/enterprise_influxdb/v1.9/concepts/glossary/#continuous-query-cq).
|
||||
This page also describes how to
|
||||
[`SHOW`](/enterprise_influxdb/v1.9/query_language/continuous_queries/#listing-continuous-queries) and
|
||||
[`DROP`](/enterprise_influxdb/v1.9/query_language/continuous_queries/#deleting-continuous-queries)
|
||||
Continuous Queries.
|
||||
|
||||
#### InfluxQL mathematical operators
|
||||
|
||||
[InfluxQL mathematical operators](/enterprise_influxdb/v1.9/query_language/math_operators/)
|
||||
covers the use of mathematical operators in InfluxQL.
|
||||
|
||||
#### Authentication and authorization
|
||||
|
||||
[Authentication and authorization](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/) covers how to
|
||||
[set up authentication](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#set-up-authentication)
|
||||
and how to
|
||||
[authenticate requests](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#authenticate-requests) in InfluxDB.
|
||||
This page also describes the different
|
||||
[user types](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-types-and-privileges) and the InfluxQL for
|
||||
[managing database users](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-management-commands).
|
||||
|
||||
## InfluxQL reference
|
||||
|
||||
The [reference documentation for InfluxQL](/enterprise_influxdb/v1.9/query_language/spec/).
|
|
|
|||
---
|
||||
title: InfluxQL Continuous Queries
|
||||
description: >
|
||||
Continuous queries (CQ) are InfluxQL queries that run automatically and periodically on realtime data and store query results in a specified measurement.
|
||||
menu:
|
||||
enterprise_influxdb_1_9:
|
||||
name: Continuous Queries
|
||||
weight: 50
|
||||
parent: InfluxQL
|
||||
v2: /influxdb/v2.0/process-data/
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
Continuous queries (CQ) are InfluxQL queries that run automatically and
|
||||
periodically on realtime data and store query results in a
|
||||
specified measurement.
|
||||
|
||||
<table style="width:100%">
|
||||
<tr>
|
||||
<td><a href="#basic-syntax">Basic Syntax</a></td>
|
||||
<td><a href="#advanced-syntax">Advanced Syntax</a></td>
|
||||
<td><a href="#continuous-query-management">CQ Management</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="#examples-of-basic-syntax">Examples of Basic Syntax</a></td>
|
||||
<td><a href="#examples-of-advanced-syntax">Examples of Advanced Syntax</a></td>
|
||||
<td><a href="#continuous-query-use-cases">CQ Use Cases</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="#common-issues-with-basic-syntax">Common Issues with Basic Syntax</a></td>
|
||||
<td><a href="#common-issues-with-advanced-syntax">Common Issues with Advanced Syntax</a></td>
|
||||
<td><a href="#further-information">Further information</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## Syntax
|
||||
|
||||
### Basic syntax
|
||||
|
||||
```sql
|
||||
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
|
||||
BEGIN
|
||||
<cq_query>
|
||||
END
|
||||
```
|
||||
|
||||
#### Description of basic syntax
|
||||
|
||||
##### The `cq_query`
|
||||
|
||||
The `cq_query` requires a
|
||||
[function](/enterprise_influxdb/v1.9/concepts/glossary/#function),
|
||||
an [`INTO` clause](/enterprise_influxdb/v1.9/query_language/spec/#clauses),
|
||||
and a [`GROUP BY time()` clause](/enterprise_influxdb/v1.9/query_language/spec/#clauses):
|
||||
|
||||
```sql
|
||||
SELECT <function[s]> INTO <destination_measurement> FROM <measurement> [WHERE <stuff>] GROUP BY time(<interval>)[,<tag_key[s]>]
|
||||
```
|
||||
|
||||
>**Note:** Notice that the `cq_query` does not require a time range in a `WHERE` clause.
|
||||
InfluxDB automatically generates a time range for the `cq_query` when it executes the CQ.
|
||||
Any user-specified time ranges in the `cq_query`'s `WHERE` clause will be ignored
|
||||
by the system.
|
||||
|
||||
##### Schedule and coverage
|
||||
|
||||
Continuous queries operate on real-time data.
|
||||
They use the local server’s timestamp, the `GROUP BY time()` interval, and
|
||||
InfluxDB database's preset time boundaries to determine when to execute and what time
|
||||
range to cover in the query.
|
||||
|
||||
CQs execute at the same interval as the `cq_query`'s `GROUP BY time()` interval,
|
||||
and they run at the start of the InfluxDB database's preset time boundaries.
|
||||
If the `GROUP BY time()` interval is one hour, the CQ executes at the start of
|
||||
every hour.
|
||||
|
||||
When the CQ executes, it runs a single query for the time range between
|
||||
[`now()`](/enterprise_influxdb/v1.9/concepts/glossary/#now) and `now()` minus the
|
||||
`GROUP BY time()` interval.
|
||||
If the `GROUP BY time()` interval is one hour and the current time is 17:00,
|
||||
the query's time range is between 16:00 and 16:59.999999999.
|
||||
|
||||
#### Examples of basic syntax
|
||||
|
||||
The examples below use the following sample data in the `transportation`
|
||||
database.
|
||||
The measurement `bus_data` stores 15-minute resolution data on the number of bus
|
||||
`passengers` and `complaints`:
|
||||
|
||||
```sql
|
||||
name: bus_data
|
||||
--------------
|
||||
time passengers complaints
|
||||
2016-08-28T07:00:00Z 5 9
|
||||
2016-08-28T07:15:00Z 8 9
|
||||
2016-08-28T07:30:00Z 8 9
|
||||
2016-08-28T07:45:00Z 7 9
|
||||
2016-08-28T08:00:00Z 8 9
|
||||
2016-08-28T08:15:00Z 15 7
|
||||
2016-08-28T08:30:00Z 15 7
|
||||
2016-08-28T08:45:00Z 17 7
|
||||
2016-08-28T09:00:00Z 20 7
|
||||
```
|
||||
|
||||
##### Automatically downsampling data
|
||||
|
||||
Use a simple CQ to automatically downsample data from a single field
|
||||
and write the results to another measurement in the same database.
|
||||
|
||||
```sql
|
||||
CREATE CONTINUOUS QUERY "cq_basic" ON "transportation"
|
||||
BEGIN
|
||||
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
|
||||
END
|
||||
```
|
||||
|
||||
`cq_basic` calculates the average hourly number of passengers from the
|
||||
`bus_data` measurement and stores the results in the `average_passengers`
|
||||
measurement in the `transportation` database.
|
||||
|
||||
`cq_basic` executes at one-hour intervals, the same interval as the
|
||||
`GROUP BY time()` interval.
|
||||
Every hour, `cq_basic` runs a single query that covers the time range between
|
||||
`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time
|
||||
range between `now()` and one hour prior to `now()`.
|
||||
|
||||
Annotated log output on the morning of August 28, 2016:
|
||||
|
||||
```sql
|
||||
>
|
||||
At **8:00** `cq_basic` executes a query with the time range `time >= '7:00' AND time < '08:00'`.
|
||||
`cq_basic` writes one point to the `average_passengers` measurement:
|
||||
>
|
||||
name: average_passengers
|
||||
------------------------
|
||||
time mean
|
||||
2016-08-28T07:00:00Z 7
|
||||
>
|
||||
At **9:00** `cq_basic` executes a query with the time range `time >= '8:00' AND time < '9:00'`.
|
||||
`cq_basic` writes one point to the `average_passengers` measurement:
|
||||
>
|
||||
name: average_passengers
|
||||
------------------------
|
||||
time mean
|
||||
2016-08-28T08:00:00Z 13.75
|
||||
```
|
||||
|
||||
Here are the results:
|
||||
|
||||
```sql
|
||||
> SELECT * FROM "average_passengers"
|
||||
name: average_passengers
|
||||
------------------------
|
||||
time mean
|
||||
2016-08-28T07:00:00Z 7
|
||||
2016-08-28T08:00:00Z 13.75
|
||||
```
|
||||
|
||||
##### Automatically downsampling data into another retention policy
|
||||
|
||||
[Fully qualify](/enterprise_influxdb/v1.9/query_language/explore-data/#the-basic-select-statement)
|
||||
the destination measurement to store the downsampled data in a non-`DEFAULT`
|
||||
[retention policy](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp) (RP).
|
||||
|
||||
```sql
|
||||
CREATE CONTINUOUS QUERY "cq_basic_rp" ON "transportation"
|
||||
BEGIN
|
||||
SELECT mean("passengers") INTO "transportation"."three_weeks"."average_passengers" FROM "bus_data" GROUP BY time(1h)
|
||||
END
|
||||
```
|
||||
|
||||
`cq_basic_rp` calculates the average hourly number of passengers from the
|
||||
`bus_data` measurement and stores the results in the `transportation` database,
|
||||
the `three_weeks` RP, and the `average_passengers` measurement.
|
||||
|
||||
`cq_basic_rp` executes at one-hour intervals, the same interval as the
|
||||
`GROUP BY time()` interval.
|
||||
Every hour, `cq_basic_rp` runs a single query that covers the time range between
|
||||
`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time
|
||||
range between `now()` and one hour prior to `now()`.
|
||||
|
||||
Annotated log output on the morning of August 28, 2016:
|
||||
|
||||
```sql
|
||||
>
|
||||
At **8:00** `cq_basic_rp` executes a query with the time range `time >= '7:00' AND time < '8:00'`.
|
||||
`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement:
|
||||
>
|
||||
name: average_passengers
|
||||
------------------------
|
||||
time mean
|
||||
2016-08-28T07:00:00Z 7
|
||||
>
|
||||
At **9:00** `cq_basic_rp` executes a query with the time range
|
||||
`time >= '8:00' AND time < '9:00'`.
|
||||
`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement:
|
||||
>
|
||||
name: average_passengers
|
||||
------------------------
|
||||
time mean
|
||||
2016-08-28T08:00:00Z 13.75
|
||||
```
Here are the results:

```sql
> SELECT * FROM "transportation"."three_weeks"."average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T07:00:00Z   7
2016-08-28T08:00:00Z   13.75
```

`cq_basic_rp` uses CQs and retention policies to automatically downsample data
and keep that downsampled data for an alternative length of time.
See the [Downsampling and Data Retention](/enterprise_influxdb/v1.9/guides/downsampling_and_retention/)
guide for an in-depth discussion about this CQ use case.

##### Automatically downsampling a database with backreferencing

Use a function with a wildcard (`*`) and the `INTO` query's
[backreferencing syntax](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause)
to automatically downsample data from all measurements and numerical fields in
a database.

```sql
CREATE CONTINUOUS QUERY "cq_basic_br" ON "transportation"
BEGIN
  SELECT mean(*) INTO "downsampled_transportation"."autogen".:MEASUREMENT FROM /.*/ GROUP BY time(30m),*
END
```

`cq_basic_br` calculates the 30-minute average of `passengers` and `complaints`
from every measurement in the `transportation` database (in this case, there's only the
`bus_data` measurement).
It stores the results in the `downsampled_transportation` database.

`cq_basic_br` executes at 30-minute intervals, the same interval as the
`GROUP BY time()` interval.
Every 30 minutes, `cq_basic_br` runs a single query that covers the time range
between `now()` and `now()` minus the `GROUP BY time()` interval, that is,
the time range between `now()` and 30 minutes prior to `now()`.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **7:30**, `cq_basic_br` executes a query with the time range `time >= '7:00' AND time < '7:30'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
    name: bus_data
    --------------
    time                  mean_complaints   mean_passengers
    2016-08-28T07:00:00Z  9                 6.5
>
At **8:00**, `cq_basic_br` executes a query with the time range `time >= '7:30' AND time < '8:00'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
    name: bus_data
    --------------
    time                  mean_complaints   mean_passengers
    2016-08-28T07:30:00Z  9                 7.5
>
[...]
>
At **9:00**, `cq_basic_br` executes a query with the time range `time >= '8:30' AND time < '9:00'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
    name: bus_data
    --------------
    time                  mean_complaints   mean_passengers
    2016-08-28T08:30:00Z  7                 16
```

Here are the results:

```sql
> SELECT * FROM "downsampled_transportation"."autogen"."bus_data"
name: bus_data
--------------
time                  mean_complaints   mean_passengers
2016-08-28T07:00:00Z  9                 6.5
2016-08-28T07:30:00Z  9                 7.5
2016-08-28T08:00:00Z  8                 11.5
2016-08-28T08:30:00Z  7                 16
```

##### Automatically downsampling data and configuring CQ time boundaries

Use an
[offset interval](/enterprise_influxdb/v1.9/query_language/explore-data/#advanced-group-by-time-syntax)
in the `GROUP BY time()` clause to alter both the CQ's default execution time and
preset time boundaries.

```sql
CREATE CONTINUOUS QUERY "cq_basic_offset" ON "transportation"
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h,15m)
END
```

`cq_basic_offset` calculates the average hourly number of passengers from the
`bus_data` measurement and stores the results in the `average_passengers`
measurement.

`cq_basic_offset` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
The 15-minute offset interval forces the CQ to execute 15 minutes after the
default execution time; `cq_basic_offset` executes at 8:15 instead of 8:00.

Every hour, `cq_basic_offset` runs a single query that covers the time range
between `now()` and `now()` minus the `GROUP BY time()` interval, that is, the
time range between `now()` and one hour prior to `now()`.
The 15-minute offset interval shifts forward the generated preset time boundaries in the
CQ's `WHERE` clause; `cq_basic_offset` queries between 7:15 and 8:14.999999999 instead of 7:00 and 7:59.999999999.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **8:15** `cq_basic_offset` executes a query with the time range `time >= '7:15' AND time < '8:15'`.
`cq_basic_offset` writes one point to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T07:15:00Z   7.75
>
At **9:15** `cq_basic_offset` executes a query with the time range `time >= '8:15' AND time < '9:15'`.
`cq_basic_offset` writes one point to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T08:15:00Z   16.75
```

Here are the results:

```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T07:15:00Z   7.75
2016-08-28T08:15:00Z   16.75
```

Notice that the timestamps are for 7:15 and 8:15 instead of 7:00 and 8:00.

#### Common issues with basic syntax

##### Handling time intervals with no data

CQs do not write any results for a time interval if no data fall within that
time range.

Note that the basic syntax does not support using
[`fill()`](/enterprise_influxdb/v1.9/query_language/explore-data/#group-by-time-intervals-and-fill)
to change the value reported for intervals with no data.
Basic syntax CQs ignore `fill()` if it's included in the CQ query.
A possible workaround is to use the
[advanced CQ syntax](#advanced-syntax).

##### Resampling previous time intervals

The basic CQ runs a single query that covers the time range between `now()`
and `now()` minus the `GROUP BY time()` interval.
See the [advanced syntax](#advanced-syntax) for how to configure the query's
time range.

##### Backfilling results for older data

CQs operate on real-time data, that is, data with timestamps that occur
relative to [`now()`](/enterprise_influxdb/v1.9/concepts/glossary/#now).
Use a basic
[`INTO` query](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause)
to backfill results for data with older timestamps.

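For example, an `INTO` query like the following (a sketch based on the sample data above; the time bounds are placeholders, not values from the original) downsamples a block of historical data that a CQ would never cover:

```sql
SELECT mean("passengers")
INTO "average_passengers"
FROM "bus_data"
WHERE time >= '2016-08-01T00:00:00Z' AND time < '2016-08-28T00:00:00Z'
GROUP BY time(1h)
```

Run it once per historical time range; unlike a CQ, it does not repeat on a schedule.
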
##### Missing tags in the CQ results

By default, all
[`INTO` queries](/enterprise_influxdb/v1.9/query_language/explore-data/#the-into-clause)
convert any tags in the source measurement to fields in the destination
measurement.

Include `GROUP BY *` in the CQ to preserve tags in the destination measurement.

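For example, a CQ along these lines (an illustrative sketch; the name `cq_preserve_tags` is hypothetical) keeps any tags on `bus_data` as tags in `average_passengers`:

```sql
CREATE CONTINUOUS QUERY "cq_preserve_tags" ON "transportation"
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h), *
END
```
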
### Advanced syntax

```txt
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
RESAMPLE EVERY <interval> FOR <interval>
BEGIN
  <cq_query>
END
```

#### Description of advanced syntax

##### The `cq_query`

See [Description of basic syntax](/enterprise_influxdb/v1.9/query_language/continuous_queries/#description-of-basic-syntax).

##### Scheduling and coverage

CQs operate on real-time data. With the advanced syntax, CQs use the local
server’s timestamp, the information in the `RESAMPLE` clause, and the InfluxDB
server's preset time boundaries to determine when to execute and what time range to
cover in the query.

CQs execute at the same interval as the `EVERY` interval in the `RESAMPLE`
clause, and they run at the start of InfluxDB’s preset time boundaries.
If the `EVERY` interval is two hours, InfluxDB executes the CQ at the top of
every other hour.

When the CQ executes, it runs a single query for the time range between
[`now()`](/enterprise_influxdb/v1.9/concepts/glossary/#now) and `now()` minus the `FOR` interval in the `RESAMPLE` clause.
If the `FOR` interval is two hours and the current time is 17:00, the query's
time range is between 15:00 and 16:59.999999999.

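Putting those two clauses together, a CQ like the following (a minimal sketch; the name `cq_every_two_hours` is hypothetical) executes every two hours and, each time it runs, queries the previous two hours of data:

```sql
CREATE CONTINUOUS QUERY "cq_every_two_hours" ON "transportation"
RESAMPLE EVERY 2h FOR 2h
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```
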
Both the `EVERY` interval and the `FOR` interval accept
[duration literals](/enterprise_influxdb/v1.9/query_language/spec/#durations).
The `RESAMPLE` clause works with either or both of the `EVERY` and `FOR` intervals
configured.
CQs default to the relevant
[basic syntax behavior](/enterprise_influxdb/v1.9/query_language/continuous_queries/#description-of-basic-syntax)
if the `EVERY` interval or `FOR` interval is not provided (see the first issue in
[Common Issues with Advanced Syntax](/enterprise_influxdb/v1.9/query_language/continuous_queries/#common-issues-with-advanced-syntax)
for an anomalous case).

#### Examples of advanced syntax

The examples below use the following sample data in the `transportation` database.
The measurement `bus_data` stores 15-minute resolution data on the number of bus
`passengers`:

```sql
name: bus_data
--------------
time                   passengers
2016-08-28T06:30:00Z   2
2016-08-28T06:45:00Z   4
2016-08-28T07:00:00Z   5
2016-08-28T07:15:00Z   8
2016-08-28T07:30:00Z   8
2016-08-28T07:45:00Z   7
2016-08-28T08:00:00Z   8
2016-08-28T08:15:00Z   15
2016-08-28T08:30:00Z   15
2016-08-28T08:45:00Z   17
2016-08-28T09:00:00Z   20
```

##### Configuring execution intervals

Use an `EVERY` interval in the `RESAMPLE` clause to specify the CQ's execution
interval.

```sql
CREATE CONTINUOUS QUERY "cq_advanced_every" ON "transportation"
RESAMPLE EVERY 30m
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```

`cq_advanced_every` calculates the one-hour average of `passengers`
from the `bus_data` measurement and stores the results in the
`average_passengers` measurement in the `transportation` database.

`cq_advanced_every` executes at 30-minute intervals, the same interval as the
`EVERY` interval.
Every 30 minutes, `cq_advanced_every` runs a single query that covers the time
range for the current time bucket, that is, the one-hour time bucket that
intersects with `now()`.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **8:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T07:00:00Z   7
>
At **8:30**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T08:00:00Z   12.6667
>
At **9:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T08:00:00Z   13.75
```

Here are the results:

```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T07:00:00Z   7
2016-08-28T08:00:00Z   13.75
```

Notice that `cq_advanced_every` calculates the result for the 8:00 time interval
twice.
First, it runs at 8:30 and calculates the average for every available data point
between 8:00 and 9:00 (`8`, `15`, and `15`).
Second, it runs at 9:00 and calculates the average for every available data
point between 8:00 and 9:00 (`8`, `15`, `15`, and `17`).
Because of the way InfluxDB
[handles duplicate points](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points),
the second result simply overwrites the first result.

##### Configuring time ranges for resampling

Use a `FOR` interval in the `RESAMPLE` clause to specify the length of the CQ's
time range.

```sql
CREATE CONTINUOUS QUERY "cq_advanced_for" ON "transportation"
RESAMPLE FOR 1h
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END
```

`cq_advanced_for` calculates the 30-minute average of `passengers`
from the `bus_data` measurement and stores the results in the `average_passengers`
measurement in the `transportation` database.

`cq_advanced_for` executes at 30-minute intervals, the same interval as the
`GROUP BY time()` interval.
Every 30 minutes, `cq_advanced_for` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and one hour prior to `now()`.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **8:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T07:00:00Z   6.5
    2016-08-28T07:30:00Z   7.5
>
At **8:30** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:30' AND time < '8:30'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T07:30:00Z   7.5
    2016-08-28T08:00:00Z   11.5
>
At **9:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T08:00:00Z   11.5
    2016-08-28T08:30:00Z   16
```

Notice that `cq_advanced_for` will calculate the result for every time interval
twice.
The CQ calculates the average for the 7:30 time interval at 8:00 and at 8:30,
and it calculates the average for the 8:00 time interval at 8:30 and 9:00.

Here are the results:

```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T07:00:00Z   6.5
2016-08-28T07:30:00Z   7.5
2016-08-28T08:00:00Z   11.5
2016-08-28T08:30:00Z   16
```

##### Configuring execution intervals and CQ time ranges

Use an `EVERY` interval and `FOR` interval in the `RESAMPLE` clause to specify
the CQ's execution interval and the length of the CQ's time range.

```sql
CREATE CONTINUOUS QUERY "cq_advanced_every_for" ON "transportation"
RESAMPLE EVERY 1h FOR 90m
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END
```

`cq_advanced_every_for` calculates the 30-minute average of
`passengers` from the `bus_data` measurement and stores the results in the
`average_passengers` measurement in the `transportation` database.

`cq_advanced_every_for` executes at one-hour intervals, the same interval as the
`EVERY` interval.
Every hour, `cq_advanced_every_for` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and 90 minutes prior to `now()`.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **8:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '6:30' AND time < '8:00'`.
`cq_advanced_every_for` writes three points to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T06:30:00Z   3
    2016-08-28T07:00:00Z   6.5
    2016-08-28T07:30:00Z   7.5
>
At **9:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '7:30' AND time < '9:00'`.
`cq_advanced_every_for` writes three points to the `average_passengers` measurement:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T07:30:00Z   7.5
    2016-08-28T08:00:00Z   11.5
    2016-08-28T08:30:00Z   16
```

Notice that `cq_advanced_every_for` will calculate the result for every time
interval twice.
The CQ calculates the average for the 7:30 interval at 8:00 and 9:00.

Here are the results:

```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T06:30:00Z   3
2016-08-28T07:00:00Z   6.5
2016-08-28T07:30:00Z   7.5
2016-08-28T08:00:00Z   11.5
2016-08-28T08:30:00Z   16
```

##### Configuring CQ time ranges and filling empty results

Use a `FOR` interval and `fill()` to change the value reported for time
intervals with no data.
Note that at least one data point must fall within the `FOR` interval for `fill()`
to operate.
If no data fall within the `FOR` interval, the CQ writes no points to the
destination measurement.

```sql
CREATE CONTINUOUS QUERY "cq_advanced_for_fill" ON "transportation"
RESAMPLE FOR 2h
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) fill(1000)
END
```

`cq_advanced_for_fill` calculates the one-hour average of `passengers` from the
`bus_data` measurement and stores the results in the `average_passengers`
measurement in the `transportation` database.
Where possible, it writes the value `1000` for time intervals with no results.

`cq_advanced_for_fill` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
Every hour, `cq_advanced_for_fill` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and two hours prior to `now()`.

Annotated log output on the morning of August 28, 2016:

```sql
>
At **6:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '4:00' AND time < '6:00'`.
`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data
that fall within that time range.
>
At **7:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '5:00' AND time < '7:00'`.
`cq_advanced_for_fill` writes two points to `average_passengers`:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T05:00:00Z   1000          <------ fill(1000)
    2016-08-28T06:00:00Z   3             <------ average of 2 and 4
>
[...]
>
At **11:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '9:00' AND time < '11:00'`.
`cq_advanced_for_fill` writes two points to `average_passengers`:
>
    name: average_passengers
    ------------------------
    time                   mean
    2016-08-28T09:00:00Z   20            <------ average of 20
    2016-08-28T10:00:00Z   1000          <------ fill(1000)
>
At **12:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '10:00' AND time < '12:00'`.
`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data
that fall within that time range.
```

Here are the results:

```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time                   mean
2016-08-28T05:00:00Z   1000
2016-08-28T06:00:00Z   3
2016-08-28T07:00:00Z   7
2016-08-28T08:00:00Z   13.75
2016-08-28T09:00:00Z   20
2016-08-28T10:00:00Z   1000
```

> **Note:** `fill(previous)` doesn’t fill the result for a time interval if the
previous value is outside the query’s time range.
See [Frequently Asked Questions](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#why-does-fill-previous-return-empty-results)
for more information.

#### Common issues with advanced syntax

##### If the `EVERY` interval is greater than the `GROUP BY time()` interval

If the `EVERY` interval is greater than the `GROUP BY time()` interval, the CQ
executes at the same interval as the `EVERY` interval and runs a single query
that covers the time range between `now()` and `now()` minus the `EVERY`
interval (not between `now()` and `now()` minus the `GROUP BY time()` interval).

For example, if the `GROUP BY time()` interval is `5m` and the `EVERY` interval
is `10m`, the CQ executes every ten minutes.
Every ten minutes, the CQ runs a single query that covers the time range
between `now()` and `now()` minus the `EVERY` interval, that is, the time
range between `now()` and ten minutes prior to `now()`.

This behavior is intentional and prevents the CQ from missing data between
execution times.

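A sketch of such a CQ follows (the name `cq_every_gt_group` is hypothetical, and it reuses the sample `transportation` database); it groups results into five-minute buckets but executes, and queries the previous ten minutes, only every ten minutes:

```sql
CREATE CONTINUOUS QUERY "cq_every_gt_group" ON "transportation"
RESAMPLE EVERY 10m
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(5m)
END
```
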
##### If the `FOR` interval is less than the execution interval

If the `FOR` interval is less than the `GROUP BY time()` interval or, if
specified, the `EVERY` interval, InfluxDB returns the following error:

```sql
error parsing query: FOR duration must be >= GROUP BY time duration: must be a minimum of <minimum-allowable-interval> got <user-specified-interval>
```

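For example, a CQ like the following (illustrative only; the name is hypothetical) is rejected with that error because its `FOR` interval (`30m`) is shorter than its `GROUP BY time()` interval (`1h`):

```sql
CREATE CONTINUOUS QUERY "cq_bad_for" ON "transportation"
RESAMPLE FOR 30m
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```
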
To avoid missing data between execution times, the `FOR` interval must be equal
to or greater than the `GROUP BY time()` interval or, if specified, the `EVERY`
interval.

Currently, this is the intended behavior.
GitHub Issue [#6963](https://github.com/influxdata/influxdb/issues/6963)
outlines a feature request for CQs to support gaps in data coverage.

## Continuous query management

Only admin users are allowed to work with CQs. For more on user privileges, see [Authentication and Authorization](/enterprise_influxdb/v1.9/administration/authentication_and_authorization/#user-types-and-privileges).

### Listing continuous queries

List every CQ on an InfluxDB instance with:

```sql
SHOW CONTINUOUS QUERIES
```

`SHOW CONTINUOUS QUERIES` groups results by database.

##### Examples

The output shows that the `telegraf` and `mydb` databases have CQs:

```sql
> SHOW CONTINUOUS QUERIES
name: _internal
---------------
name   query


name: telegraf
--------------
name           query
idle_hands     CREATE CONTINUOUS QUERY idle_hands ON telegraf BEGIN SELECT min(usage_idle) INTO telegraf.autogen.min_hourly_cpu FROM telegraf.autogen.cpu GROUP BY time(1h) END
feeling_used   CREATE CONTINUOUS QUERY feeling_used ON telegraf BEGIN SELECT mean(used) INTO downsampled_telegraf.autogen.:MEASUREMENT FROM telegraf.autogen./.*/ GROUP BY time(1h) END


name: downsampled_telegraf
--------------------------
name   query


name: mydb
----------
name      query
vampire   CREATE CONTINUOUS QUERY vampire ON mydb BEGIN SELECT count(dracula) INTO mydb.autogen.all_of_them FROM mydb.autogen.one GROUP BY time(5m) END
```

### Deleting continuous queries

Delete a CQ from a specific database with:

```sql
DROP CONTINUOUS QUERY <cq_name> ON <database_name>
```

`DROP CONTINUOUS QUERY` returns an empty result.

##### Examples

Drop the `idle_hands` CQ from the `telegraf` database:

```sql
> DROP CONTINUOUS QUERY "idle_hands" ON "telegraf"
>
```

### Altering continuous queries

CQs cannot be altered once they're created.
To change a CQ, you must `DROP` the existing CQ and then `CREATE` a new one with the updated settings.

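For example, to change the `GROUP BY time()` interval of the `idle_hands` CQ shown above, you might drop it and create a replacement with the new interval (a sketch; the 30-minute interval is arbitrary):

```sql
DROP CONTINUOUS QUERY "idle_hands" ON "telegraf"

CREATE CONTINUOUS QUERY "idle_hands" ON "telegraf"
BEGIN
  SELECT min("usage_idle") INTO "telegraf"."autogen"."min_hourly_cpu" FROM "telegraf"."autogen"."cpu" GROUP BY time(30m)
END
```
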
### Continuous query statistics

If `query-stats-enabled` is set to `true` in your `influxdb.conf` or with the `INFLUXDB_CONTINUOUS_QUERIES_QUERY_STATS_ENABLED` environment variable, data is written to `_internal` with information about when continuous queries ran and their duration.

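A minimal sketch of the relevant `influxdb.conf` section (other continuous query settings omitted):

```toml
[continuous_queries]
  # Write CQ execution time and duration statistics to the "_internal" database
  query-stats-enabled = true
```
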
Information about CQ configuration settings is available in the [Configuration](/enterprise_influxdb/v1.9/administration/config/#continuous-queries-settings) documentation.

> **Note:** `_internal` houses internal system data and is meant for internal use.
The structure of and data stored in `_internal` can change at any time.
Use of this data falls outside the scope of official InfluxData support.

## Continuous query use cases

### Downsampling and data retention

Use CQs with InfluxDB database
[retention policies](/enterprise_influxdb/v1.9/concepts/glossary/#retention-policy-rp)
(RPs) to mitigate storage concerns.
Combine CQs and RPs to automatically downsample high precision data to a lower
precision and remove the dispensable, high precision data from the database.

See the
[Downsampling and data retention](/enterprise_influxdb/v1.9/guides/downsampling_and_retention/)
guide for a detailed walkthrough of this common use case.

### Precalculating expensive queries

Shorten query runtimes by pre-calculating expensive queries with CQs.
Use a CQ to automatically downsample commonly-queried, high precision data to a
lower precision.
Queries on lower precision data require fewer resources and return faster.

**Tip:** Pre-calculate queries for your preferred graphing tool to accelerate
the population of graphs and dashboards.

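For example, a CQ along these lines (a sketch; the destination measurement name is hypothetical, and the `telegraf` CPU data matches the `SHOW CONTINUOUS QUERIES` output above) keeps a five-minute rollup ready for dashboards instead of aggregating raw data at query time:

```sql
CREATE CONTINUOUS QUERY "cq_dashboard_cpu" ON "telegraf"
BEGIN
  SELECT mean("usage_idle") AS "mean_usage_idle" INTO "dashboard_cpu_5m" FROM "cpu" GROUP BY time(5m), *
END
```
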
### Substituting for a `HAVING` clause

InfluxQL does not support [`HAVING` clauses](https://en.wikipedia.org/wiki/Having_%28SQL%29).
Get the same functionality by creating a CQ to aggregate the data and querying
the CQ results to apply the `HAVING` clause.

> **Note:** InfluxQL supports [subqueries](/enterprise_influxdb/v1.9/query_language/explore-data/#subqueries) which also offer similar functionality to `HAVING` clauses.
See [Data Exploration](/enterprise_influxdb/v1.9/query_language/explore-data/#subqueries) for more information.

##### Example

InfluxDB does not accept the following query with a `HAVING` clause.
The query calculates the average number of `bees` at 30-minute intervals and
requests averages that are greater than `20`.

```sql
SELECT mean("bees") FROM "farm" GROUP BY time(30m) HAVING mean("bees") > 20
```

To get the same results:

**1. Create a CQ**

This step performs the `mean("bees")` part of the query above.
Because this step creates a CQ, you only need to execute it once.

The following CQ automatically calculates the average number of `bees` at
30-minute intervals and writes those averages to the `mean_bees` field in the
`aggregate_bees` measurement.

```sql
CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT mean("bees") AS "mean_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END
```

**2. Query the CQ results**

This step performs the `HAVING mean("bees") > 20` part of the query above.

Query the data in the measurement `aggregate_bees` and request values of the `mean_bees` field that are greater than `20` in the `WHERE` clause:

```sql
SELECT "mean_bees" FROM "aggregate_bees" WHERE "mean_bees" > 20
```

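Alternatively, as the note above mentions, a subquery can express the same `HAVING`-style filter in a single statement (a sketch, not part of the original example):

```sql
SELECT "mean_bees" FROM (SELECT mean("bees") AS "mean_bees" FROM "farm" GROUP BY time(30m)) WHERE "mean_bees" > 20
```
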
### Substituting for nested functions

Some InfluxQL functions
[support nesting](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#which-influxql-functions-support-nesting)
of other functions.
Most do not.
If your function does not support nesting, you can get the same functionality using a CQ to calculate
the inner-most function.
Then simply query the CQ results to calculate the outer-most function.

> **Note:** InfluxQL supports [subqueries](/enterprise_influxdb/v1.9/query_language/explore-data/#subqueries) which also offer the same functionality as nested functions.
See [Data Exploration](/enterprise_influxdb/v1.9/query_language/explore-data/#subqueries) for more information.

##### Example

InfluxDB does not accept the following query with a nested function.
The query calculates the number of non-null values
of `bees` at 30-minute intervals and the average of those counts:

```sql
SELECT mean(count("bees")) FROM "farm" GROUP BY time(30m)
```

To get the same results:

**1. Create a CQ**

This step performs the `count("bees")` part of the nested function above.
Because this step creates a CQ, you only need to execute it once.

The following CQ automatically calculates the number of non-null values of `bees` at 30-minute intervals
and writes those counts to the `count_bees` field in the `aggregate_bees` measurement.

```sql
CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT count("bees") AS "count_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END
```

**2. Query the CQ results**

This step performs the `mean([...])` part of the nested function above.

Query the data in the measurement `aggregate_bees` to calculate the average of the
`count_bees` field:

```sql
SELECT mean("count_bees") FROM "aggregate_bees" WHERE time >= <start_time> AND time <= <end_time>
```

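Alternatively, as the note above mentions, a subquery achieves the same nesting in one statement (a sketch; the time bounds remain placeholders):

```sql
SELECT mean("count_bees") FROM (SELECT count("bees") AS "count_bees" FROM "farm" GROUP BY time(30m)) WHERE time >= <start_time> AND time <= <end_time>
```
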
## Further information

To see how to combine two InfluxDB features (CQs and retention policies)
to periodically downsample data and automatically expire the dispensable high
precision data, see [Downsampling and data retention](/enterprise_influxdb/v1.9/guides/downsampling_and_retention/).

Kapacitor, InfluxData's data processing engine, can do the same work as
continuous queries in InfluxDB databases.

To learn when to use Kapacitor instead of InfluxDB and how to perform the same CQ
functionality with a TICKscript, see [examples of continuous queries in Kapacitor](/{{< latest "kapacitor" >}}/guides/continuous_queries/).