ported influxdb 1.8, added expand shortcode

pull/1387/head
Scott Anderson 2020-07-29 15:07:24 -06:00
parent 6518a943c5
commit 99be367326
90 changed files with 29721 additions and 0 deletions

View File

@ -339,6 +339,25 @@ Truncated markdown content here.
{{% /truncate %}}
```
### Expandable accordion content blocks
Use the `{{% expand "Item label" %}}` shortcode to create expandable, accordion-style content blocks.
Each expandable block needs a label that users can click to expand or collapse the content block.
Pass the label as a string to the shortcode.
```md
{{% expand "Lable 1"}}
Markdown content associated with label 1.
{{% /expand %}}
{{% expand "Lable 2"}}
Markdown content associated with label 2.
{{% /expand %}}
{{% expand "Lable 3"}}
Markdown content associated with label 3.
{{% /expand %}}
```
### Generate a list of children articles
Section landing pages often contain just a list of articles with links and descriptions for each.
This can be cumbersome to maintain as content is added.

View File

@ -87,6 +87,12 @@ $(".truncate-toggle").click(function(e) {
$(this).closest('.truncate').toggleClass('closed');
})
////////////////////////////// Expand Accordions ///////////////////////////////
$('.expand-label').click(function() {
  $(this).children('.expand-toggle').toggleClass('open')
  $(this).next('.expand-content').slideToggle(200)
})
//////////////////// Replace Missing Images with Placeholder ///////////////////
$(".article--content img").on("error", function() {

View File

@ -103,6 +103,7 @@
"article/code",
"article/cloud",
"article/enterprise",
"article/expand",
"article/feedback",
"article/flex",
"article/lists",

View File

@ -0,0 +1,51 @@
// Styles for accordion-like expandable content blocks

$vertical-offset: -14px;

.expand {
  border-top: 1px solid $article-hr;
  padding: .75rem 0;
  &:last-of-type { border-bottom: 1px solid $article-hr; }
}

.expand-label {
  display: flex;
  align-items: center;
  font-weight: bold;
  margin-bottom: 0;
  cursor: pointer;
  &:hover {
    .expand-toggle { background: $article-link; }
  }
}

.expand-toggle {
  position: relative;
  display: inline-block;
  min-height: 20px;
  min-width: 20px;
  background: rgba($article-text, .25);
  border-radius: 50%;
  margin-right: .75rem;
  transition: background-color .2s;

  // The ::before and ::after pseudo-elements form the two bars of a "+" icon
  &:before, &:after {
    content: "";
    display: block;
    width: 10px;
    height: 2px;
    position: absolute;
    background: $article-bg;
    transition: all .4s;
    top: 9px;
    left: 5px;
  }
  &:after {
    transform: rotate(90deg);
  }
  // When open, both bars rotate to horizontal, collapsing the "+" into a "-"
  &.open {
    &:before, &:after { transform: rotate(180deg); }
  }
}

View File

@ -0,0 +1,32 @@
---
title: InfluxDB 1.8 documentation
menu:
  influxdb:
    name: v1.8
    identifier: influxdb_1_8
    weight: 1
---
InfluxDB is a [time series database](https://www.influxdata.com/time-series-database/) designed to handle high write and query loads.
It is an integral component of the
[TICK stack](https://influxdata.com/time-series-platform/).
InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
## Key features
Here are some of the features that InfluxDB currently supports that make it a great choice for working with time series data.
* Custom high-performance datastore written specifically for time series data.
The TSM engine allows for high ingest speed and data compression.
* Written entirely in Go.
It compiles into a single binary with no external dependencies.
* Simple, high-performing write and query HTTP APIs.
* Plugin support for other data ingestion protocols such as Graphite, collectd, and OpenTSDB.
* Expressive SQL-like query language tailored to easily query aggregated data.
* Tags allow series to be indexed for fast and efficient queries.
* Retention policies efficiently auto-expire stale data.
* Continuous queries automatically compute aggregate data to make frequent queries more efficient.
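For a quick taste of the HTTP API, the following sketch creates a database, writes a single point, and queries it back (it assumes a local instance on the default port; the `mydb` database and `cpu` measurement are placeholders):

```bash
# Create a database, write one point over the HTTP API, then query it back.
curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE DATABASE mydb"
curl -XPOST "http://localhost:8086/write?db=mydb" --data-binary 'cpu,host=server01 value=0.64'
curl -G "http://localhost:8086/query?db=mydb" --data-urlencode "q=SELECT * FROM cpu"
```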
The open source edition of InfluxDB runs on a single node.
If you require high availability to eliminate a single point of failure, consider the [InfluxDB Enterprise Edition](https://docs.influxdata.com/influxdb/latest/high_availability/).

View File

@ -0,0 +1,31 @@
---
title: About InfluxDB OSS
aliases:
  - /docs/v1.8/about/
menu:
  influxdb_1_8:
    name: About the project
    weight: 10
---
## [Release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/)
Details about features, bug fixes, and breaking changes for the current and earlier InfluxDB open source (OSS) releases are available in the [InfluxDB OSS release notes](/influxdb/v1.8/about_the_project/releasenotes-changelog/).
## [Contributing to InfluxDB](/influxdb/v1.8/about_the_project/contributing/)
To learn how you can contribute to the InfluxDB OSS project, see [Contributing to InfluxDB OSS](https://github.com/influxdata/influxdb/tree/1.8/CONTRIBUTING.md) in the InfluxDB OSS GitHub project.
## [InfluxData Contributor License Agreement (CLA)](/influxdb/v1.8/about_the_project/cla/)
Before contributing to the InfluxDB OSS project, you must complete and sign
the [InfluxData Contributor License Agreement (CLA)](https://www.influxdata.com/legal/cla/).
## [InfluxDB open source license](/influxdb/v1.8/about_the_project/licenses/)
The [open source license for InfluxDB](https://github.com/influxdata/influxdb/blob/master/LICENSE)
is available in the GitHub repository.
## [Third party software](/influxdb/v1.8/about_the_project/third-party/)
The [list of third party software components, including references to associated licenses and other materials](https://github.com/influxdata/influxdb/blob/1.8/DEPENDENCIES.md), is maintained on a version-by-version basis.

View File

@ -0,0 +1,12 @@
---
title: InfluxData Contributor License Agreement (CLA)
menu:
  influxdb_1_8:
    name: Contributor license agreement
    weight: 30
    parent: About the project
---
Before contributing to the InfluxDB OSS project, you must complete and sign
the [InfluxData Contributor License Agreement (CLA)](https://www.influxdata.com/legal/cla/),
available on the InfluxData website.

View File

@ -0,0 +1,12 @@
---
title: Contribute to InfluxDB OSS
menu:
  influxdb_1_8:
    name: Contribute to InfluxDB
    weight: 20
    parent: About the project
---
To learn how you can contribute to the InfluxDB OSS project, see
[Contributing to InfluxDB](https://github.com/influxdata/influxdb/tree/1.8/CONTRIBUTING.md)
in the GitHub repository.

View File

@ -0,0 +1,11 @@
---
title: Open source license for InfluxDB
menu:
  influxdb_1_8:
    name: InfluxDB license
    weight: 40
    parent: About the project
---
The [open source license for InfluxDB](https://github.com/influxdata/influxdb/blob/master/LICENSE)
is available in the GitHub repository.

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,20 @@
---
title: Third party software
menu:
  influxdb_1_8:
    name: Third party software
    weight: 50
    parent: About the project
---
InfluxData products contain third party software, which means the copyrighted,
patented, or otherwise legally protected software of third parties that is
incorporated in InfluxData products.
Third party suppliers make no representation or warranty with respect to
such third party software or any portion thereof.
Third party suppliers assume no liability for any claim that might arise with
respect to such third party software, nor for a
customer's use of or inability to use the third party software.
The [list of third party software components, including references to associated licenses and other materials](https://github.com/influxdata/influxdb/blob/1.8/DEPENDENCIES.md), is maintained on a version-by-version basis.

View File

@ -0,0 +1,36 @@
---
title: Additional InfluxDB resources
description: InfluxDB resources, including InfluxData blog, technical papers, meetup and training videos, and upcoming virtual training and other events.
menu:
  influxdb_1_8:
    name: Additional resources
    weight: 120
---
Check out the following InfluxData resources to learn more about InfluxDB OSS and other InfluxData products.
## [InfluxData blog](https://www.influxdata.com/blog/)
Check out the [InfluxData blog](https://www.influxdata.com/blog/) for announcements, updates, and
weekly [tech tips](https://www.influxdata.com/category/tech-tips/).
## [Technical papers](https://www.influxdata.com/_resources/techpapers-new/)
The [InfluxData technical papers](https://www.influxdata.com/_resources/techpapers-new/) series offers in-depth analysis of performance, time series,
and benchmarking of InfluxDB compared to other popular databases.
## [Meetup videos](https://www.influxdata.com/_resources/videosnew/)
Check out our growing collection of [meetup videos](https://www.influxdata.com/_resources/videosnew/) for introductory content, how-tos, and more.
## [Virtual training videos](https://www.influxdata.com/_resources/videosnew/)
Watch [virtual training videos](https://www.influxdata.com/_resources/videosnew/) from our weekly training webinar.
## [Virtual training schedule](https://www.influxdata.com/virtual-training-courses/)
Check out our [virtual training schedule](https://www.influxdata.com/virtual-training-courses/) to register for future webinars.
## [InfluxData events](https://www.influxdata.com/events/)
Learn about and sign up for upcoming [InfluxData events](https://www.influxdata.com/events/).

View File

@ -0,0 +1,57 @@
---
title: Administer InfluxDB
menu:
  influxdb_1_8:
    name: Administration
    weight: 50
---
The administration documentation contains all the information needed to administer a working InfluxDB installation.
## [Configuring InfluxDB](/influxdb/v1.8/administration/config/)
Information about the InfluxDB configuration file (`influxdb.conf`).
## [Authentication and authorization](/influxdb/v1.8/administration/authentication_and_authorization/)
Covers how to
[set up authentication](/influxdb/v1.8/administration/authentication_and_authorization/#set-up-authentication)
and how to
[authenticate requests](/influxdb/v1.8/administration/authentication_and_authorization/#authenticate-requests) in InfluxDB.
This page also describes the different
[user types](/influxdb/v1.8/administration/authentication_and_authorization/#user-types-and-privileges) and the InfluxQL for
[managing database users](/influxdb/v1.8/administration/authentication_and_authorization/#user-management-commands).
## [Upgrading](/influxdb/v1.8/administration/upgrading/)
Information about upgrading from previous versions of InfluxDB.
## [Enabling HTTPS](/influxdb/v1.8/administration/https_setup/)
Enabling HTTPS encrypts the communication between clients and the InfluxDB server.
HTTPS can also verify the authenticity of the InfluxDB server to connecting clients.
## [Logging in InfluxDB](/influxdb/v1.8/administration/logs/)
Information on how to direct InfluxDB log output.
## [Ports](/influxdb/v1.8/administration/ports/)
The TCP and UDP ports that InfluxDB uses.
## [Backing up and restoring](/influxdb/v1.8/administration/backup_and_restore/)
Procedures to back up data created by InfluxDB and to restore from a backup.
## [Managing security](/influxdb/v1.8/administration/security/)
Overview of security options and configurations.
## [Stability and compatibility](/influxdb/v1.8/administration/stability_and_compatibility/)
Management of breaking changes, upgrades, and ongoing support.
## Downgrading
To revert to a prior version, complete the same steps as when [Upgrading to InfluxDB 1.8.x](/influxdb/v1.8/administration/upgrading/), replacing 1.8.x with the version you want to downgrade to. After downloading the release, migrating your configuration settings, and enabling TSI or TSM, make sure to [rebuild your index](/influxdb/v1.8/administration/rebuild-tsi-index/#sidebar). A sketch of this flow follows the note below.
>**Note:** Some versions of InfluxDB may have breaking changes that impact your ability to upgrade and downgrade. For example, you cannot downgrade from InfluxDB 1.3 or later to an earlier version. Review the release notes for the applicable versions to check for compatibility issues between releases.
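For example, on a Debian/Ubuntu package install, a downgrade might look like the following sketch (the package file name and the data/WAL paths are placeholders for your system):

```bash
# Stop the service, install the previously downloaded older release, then rebuild the index.
sudo systemctl stop influxdb
sudo dpkg -i influxdb_1.7.11_amd64.deb
sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
sudo systemctl start influxdb
```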

View File

@ -0,0 +1,504 @@
---
title: Authentication and authorization in InfluxDB
aliases:
  - /influxdb/v1.8/administration/authentication_and_authorization/
menu:
  influxdb_1_8:
    name: Manage authentication and authorization
    weight: 20
    parent: Administration
---
This document covers setting up and managing authentication and authorization in InfluxDB.
- [Authentication](#authentication)
- [Set up Authentication](#set-up-authentication)
- [Authenticate Requests](#authenticate-requests)
- [Authorization](#authorization)
- [User Types and Privileges](#user-types-and-privileges)
- [User Management Commands](#user-management-commands)
- [HTTP Errors](#authentication-and-authorization-http-errors)
> **Note:** Authentication and authorization should not be relied upon to prevent access and protect data from malicious actors.
If additional security or compliance features are desired, InfluxDB should be run behind a third-party service. If InfluxDB is
being deployed on a publicly accessible endpoint, we strongly recommend authentication be enabled. Otherwise the data will be
publicly available to any unauthenticated user.
## Authentication
The InfluxDB API and the [command line interface](/influxdb/v1.8/tools/shell/) (CLI), which connects to the database using the API, include simple, built-in authentication based on user credentials.
When you enable authentication, InfluxDB only executes HTTP requests that are sent with valid credentials.
> **Note:** Authentication only occurs at the HTTP request scope.
Plugins do not currently have the ability to authenticate requests, and service endpoints (for example, Graphite, collectd, etc.) are not authenticated.
### Set up authentication
#### 1. Create at least one [admin user](#admin-users).
See the [authorization section](#authorization) for how to create an admin user.
> **Note:** If you enable authentication and have no users, InfluxDB will **not** enforce authentication and will only accept the [query](#user-management-commands) that creates a new admin user.
InfluxDB will enforce authentication once there is an admin user.
#### 2. Enable authentication in the configuration file.
By default, authentication is disabled. To enable it, set the `auth-enabled` option to `true` in the `[http]` section of the configuration file:
```toml
[http]
enabled = true
bind-address = ":8086"
auth-enabled = true # ✨
log-enabled = true
write-tracing = false
pprof-enabled = true
pprof-auth-enabled = true
debug-pprof-enabled = false
ping-auth-enabled = true
https-enabled = true
https-certificate = "/etc/ssl/influxdb.pem"
```
{{% note %}}
If `pprof-enabled` is set to `true`, set `pprof-auth-enabled` and `ping-auth-enabled`
to `true` to require authentication on profiling and ping endpoints.
{{% /note %}}
#### 3. Restart the process
Now InfluxDB will check user credentials on every request and will only process requests that have valid credentials for an existing user.
### Authenticate requests
#### Authenticate with the InfluxDB API
There are two options for authenticating with the [InfluxDB API](/influxdb/v1.8/tools/api/).
If you authenticate with both Basic Authentication **and** the URL query parameters, the user credentials specified in the query parameters take precedence.
The queries in the following examples assume that the user is an [admin user](#admin-users).
See the section on [authorization](#authorization) for the different user types, their privileges, and more on user management.
> **Note:** InfluxDB redacts passwords when you enable authentication.
##### Authenticate with Basic Authentication as described in [RFC 2617, Section 2](http://tools.ietf.org/html/rfc2617)
This is the preferred method for providing user credentials.
Example:
```bash
curl -G http://localhost:8086/query -u todd:influxdb4ever --data-urlencode "q=SHOW DATABASES"
```
##### Authenticate by providing query parameters in the URL or request body
Set `u` as the username and `p` as the password.
###### Example using query parameters
```bash
curl -G "http://localhost:8086/query?u=todd&p=influxdb4ever" --data-urlencode "q=SHOW DATABASES"
```
###### Example using request body
```bash
curl -G http://localhost:8086/query --data-urlencode "u=todd" --data-urlencode "p=influxdb4ever" --data-urlencode "q=SHOW DATABASES"
```
#### Authenticate with the CLI
There are three options for authenticating with the [CLI](/influxdb/v1.8/tools/shell/).
##### Authenticate with the `INFLUX_USERNAME` and `INFLUX_PASSWORD` environment variables
Example:
```bash
export INFLUX_USERNAME=todd
export INFLUX_PASSWORD=influxdb4ever
echo $INFLUX_USERNAME $INFLUX_PASSWORD
todd influxdb4ever
influx
Connected to http://localhost:8086 version 1.4.x
InfluxDB shell 1.4.x
```
##### Authenticate by setting the `username` and `password` flags when you start the CLI
Example:
```bash
influx -username todd -password influxdb4ever
Connected to http://localhost:8086 version 1.4.x
InfluxDB shell 1.4.x
```
##### Authenticate with `auth <username> <password>` after starting the CLI
Example:
```bash
influx
Connected to http://localhost:8086 version 1.4.x
InfluxDB shell 1.4.x
> auth
username: todd
password:
>
```
#### Authenticate using JWT tokens
Passing JWT tokens in each request is a more secure alternative to using passwords.
This is currently only possible through the [InfluxDB HTTP API](/influxdb/v1.8/tools/api/).
##### 1. Add a shared secret in your InfluxDB configuration file
InfluxDB uses the shared secret to encode the JWT signature.
By default, `shared-secret` is set to an empty string, in which case no JWT authentication takes place.
Add a custom shared secret in your [InfluxDB configuration file](/influxdb/v1.8/administration/config/#shared-secret).
The longer the secret string, the more secure it is:
```
[http]
shared-secret = "my super secret pass phrase"
```
Alternatively, to avoid keeping your secret phrase as plain text in your InfluxDB configuration file, set the value with the `INFLUXDB_HTTP_SHARED_SECRET` environment variable.
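For example, when launching `influxd` manually (a sketch; the secret value is a placeholder):

```bash
# Provide the shared secret through the environment rather than the config file.
export INFLUXDB_HTTP_SHARED_SECRET="my super secret pass phrase"
influxd -config /etc/influxdb/influxdb.conf
```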
##### 2. Generate your token
Use an authentication service to generate a secure token using your InfluxDB username, an expiration time, and your shared secret.
There are online tools, such as [https://jwt.io/](https://jwt.io/), that will do this for you.
The payload (or claims) of the token must be in the following format:
```
{
  "username": "myUserName",
  "exp": 1516239022
}
```
**username** - The name of your InfluxDB user.
**exp** - The expiration time of the token in UNIX epoch time.
For increased security, keep token expiration periods short.
For testing, you can manually generate UNIX timestamps using [https://www.unixtimestamp.com/index.php](https://www.unixtimestamp.com/index.php).
Encode the payload using your shared secret.
You can do this with either a JWT library in your own authentication server or by hand at [https://jwt.io/](https://jwt.io/).
The generated token should look similar to the following:
```
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.he0ErCNloe4J7Id0Ry2SEDg09lKkZkfsRiGsdX_vgEg
```
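If you prefer to generate a token from the command line, the following is a minimal sketch using `openssl` (it assumes the HS256 algorithm and the shared secret shown above; the username and expiration are placeholders):

```bash
# Build an HS256 JWT by hand: base64url-encode the header and payload,
# then sign them with the shared secret.
header='{"alg":"HS256","typ":"JWT"}'
payload='{"username":"myUserName","exp":1516239022}'
secret='my super secret pass phrase'

b64url() { openssl base64 -e -A | tr '+/' '-_' | tr -d '='; }

signing_input="$(printf %s "$header" | b64url).$(printf %s "$payload" | b64url)"
signature="$(printf %s "$signing_input" | openssl dgst -sha256 -hmac "$secret" -binary | b64url)"

echo "$signing_input.$signature"
```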
##### 3. Include the token in HTTP requests
Include your generated token as part of the ``Authorization`` header in HTTP requests.
Use the ``Bearer`` authorization scheme:
```
Authorization: Bearer <myToken>
```
{{% note %}}
Only unexpired tokens will successfully authenticate.
Be sure your token has not expired.
{{% /note %}}
###### Example query request with JWT authentication
```bash
curl -XGET "http://localhost:8086/query?db=demodb" \
--data-urlencode "q=SHOW DATABASES" \
--header "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.he0ErCNloe4J7Id0Ry2SEDg09lKkZkfsRiGsdX_vgEg"
```
## Authenticate Telegraf requests to InfluxDB
Authenticating [Telegraf](/telegraf/latest/) requests to an InfluxDB instance with
authentication enabled requires some additional steps.
In the Telegraf configuration file (`/etc/telegraf/telegraf.conf`), uncomment
and edit the `username` and `password` settings.
```toml
###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

[...]

  ## Write timeout (for the InfluxDB client), formatted as a string.
  ## If not provided, will default to 5s. 0s means no timeout (not recommended).
  timeout = "5s"
  username = "telegraf" #💥
  password = "metricsmetricsmetricsmetrics" #💥

[...]
```
Next, restart Telegraf and you're all set!
## Authorization
Authorization is only enforced once you've [enabled authentication](#set-up-authentication).
By default, authentication is disabled, all credentials are silently ignored, and all users have all privileges.
### User types and privileges
#### Admin users
Admin users have `READ` and `WRITE` access to all databases and full access to the following administrative queries:
Database management:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`CREATE DATABASE`, and `DROP DATABASE`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`DROP SERIES` and `DROP MEASUREMENT`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`CREATE RETENTION POLICY`, `ALTER RETENTION POLICY`, and `DROP RETENTION POLICY`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`CREATE CONTINUOUS QUERY` and `DROP CONTINUOUS QUERY`
See the [database management](/influxdb/v1.8/query_language/database_management/) and [continuous queries](/influxdb/v1.8/query_language/continuous_queries/) pages for a complete discussion of the commands listed above.
User management:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Admin user management:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`CREATE USER`](#user-management-commands), [`GRANT ALL PRIVILEGES`](#grant-administrative-privileges-to-an-existing-user), [`REVOKE ALL PRIVILEGES`](#revoke-administrative-privileges-from-an-admin-user), and [`SHOW USERS`](#show-all-existing-users-and-their-admin-status)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Non-admin user management:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`CREATE USER`](#user-management-commands), [`GRANT [READ,WRITE,ALL]`](#grant-read-write-or-all-database-privileges-to-an-existing-user), [`REVOKE [READ,WRITE,ALL]`](#revoke-read-write-or-all-database-privileges-from-an-existing-user), and [`SHOW GRANTS`](#show-a-user-s-database-privileges)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;General user management:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[`SET PASSWORD`](#re-set-a-user-s-password) and [`DROP USER`](#drop-a-user)
See [below](#user-management-commands) for a complete discussion of the user management commands.
#### Non-admin users
Non-admin users can have one of the following three privileges per database:
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`READ`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`WRITE`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`ALL` (both `READ` and `WRITE` access)
`READ`, `WRITE`, and `ALL` privileges are controlled per user per database. A new non-admin user has no access to any database until they are specifically [granted privileges to a database](#grant-read-write-or-all-database-privileges-to-an-existing-user) by an admin user.
Non-admin users can [`SHOW`](/influxdb/v1.8/query_language/schema_exploration/#show-databases) the databases on which they have `READ` and/or `WRITE` permissions.
### User management commands
#### Admin user management
When you enable HTTP authentication, InfluxDB requires you to create at least one admin user before you can interact with the system.
`CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES`
##### `CREATE` another admin user
```sql
CREATE USER <username> WITH PASSWORD '<password>' WITH ALL PRIVILEGES
```
CLI example:
```sql
> CREATE USER paul WITH PASSWORD 'timeseries4days' WITH ALL PRIVILEGES
>
```
> **Note:** Repeating the exact `CREATE USER` statement is idempotent. If any values change, the database returns a duplicate user error. See GitHub Issue [#6890](https://github.com/influxdata/influxdb/pull/6890) for details.

CLI example:

```sql
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
> CREATE USER todd WITH PASSWORD '123' WITH ALL PRIVILEGES
ERR: user already exists
> CREATE USER todd WITH PASSWORD '123456'
ERR: user already exists
> CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
>
```
##### `GRANT` administrative privileges to an existing user
```sql
GRANT ALL PRIVILEGES TO <username>
```
CLI example:
```sql
> GRANT ALL PRIVILEGES TO "todd"
>
```
##### `REVOKE` administrative privileges from an admin user
```sql
REVOKE ALL PRIVILEGES FROM <username>
```
CLI example:
```sql
> REVOKE ALL PRIVILEGES FROM "todd"
>
```
##### `SHOW` all existing users and their admin status
```sql
SHOW USERS
```
CLI example:
```sql
> SHOW USERS
user     admin
todd     false
paul     true
hermione false
dobby    false
```
#### Non-admin user management
##### `CREATE` a new non-admin user
```sql
CREATE USER <username> WITH PASSWORD '<password>'
```
CLI example:
```sql
> CREATE USER todd WITH PASSWORD 'influxdb41yf3'
> CREATE USER alice WITH PASSWORD 'wonder\'land'
> CREATE USER "rachel_smith" WITH PASSWORD 'asdf1234!'
> CREATE USER "monitoring-robot" WITH PASSWORD 'XXXXX'
> CREATE USER "$savyadmin" WITH PASSWORD 'm3tr1cL0v3r'
>
```
> **Notes:**
>
> * The user value must be wrapped in double quotes if it starts with a digit, is an InfluxQL keyword, contains a hyphen, or includes any special characters (for example, `!@#$%^&*()-`).
> * The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
>   Do not include the single quotes when authenticating requests.
>   We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
>   For passwords that include these characters, escape the special character with a backslash (for example, `\'`) when creating the password and when submitting authentication requests (see the example after this note).
> * Repeating the exact `CREATE USER` statement is idempotent. If any values change, the database returns a duplicate user error. See GitHub Issue [#6890](https://github.com/influxdata/influxdb/pull/6890) for details.

CLI example:

```sql
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123'
ERR: user already exists
> CREATE USER "todd" WITH PASSWORD '123456'
> CREATE USER "todd" WITH PASSWORD '123456' WITH ALL PRIVILEGES
ERR: user already exists
> CREATE USER "todd" WITH PASSWORD '123456'
>
```
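To illustrate the escaping rule: the `alice` user above was created with the password `wonder'land` (written as `wonder\'land` in InfluxQL), but authentication requests use the unescaped form. A sketch against a local instance:

```bash
# Authenticate as alice; the password is sent without the backslash escape.
curl -G "http://localhost:8086/query" -u "alice:wonder'land" --data-urlencode "q=SHOW DATABASES"
```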
##### `GRANT` `READ`, `WRITE`, or `ALL` database privileges to an existing user
```sql
GRANT [READ,WRITE,ALL] ON <database_name> TO <username>
```
CLI examples:
`GRANT` `READ` access to `todd` on the `NOAA_water_database` database:
```sql
> GRANT READ ON "NOAA_water_database" TO "todd"
>
```
`GRANT` `ALL` access to `todd` on the `NOAA_water_database` database:
```sql
> GRANT ALL ON "NOAA_water_database" TO "todd"
>
```
##### `REVOKE` `READ`, `WRITE`, or `ALL` database privileges from an existing user
```sql
REVOKE [READ,WRITE,ALL] ON <database_name> FROM <username>
```
CLI examples:
`REVOKE` `ALL` privileges from `todd` on the `NOAA_water_database` database:
```sql
> REVOKE ALL ON "NOAA_water_database" FROM "todd"
>
```
`REVOKE` `WRITE` privileges from `todd` on the `NOAA_water_database` database:
```sql
> REVOKE WRITE ON "NOAA_water_database" FROM "todd"
>
```
>**Note:** If a user with `ALL` privileges has `WRITE` privileges revoked, they are left with `READ` privileges, and vice versa.
##### `SHOW` a user's database privileges
```sql
SHOW GRANTS FOR <user_name>
```
CLI example:
```sql
> SHOW GRANTS FOR "todd"
database                  privilege
NOAA_water_database       WRITE
another_database_name     READ
yet_another_database_name ALL PRIVILEGES
one_more_database_name    NO PRIVILEGES
```
#### General admin and non-admin user management
##### Re`SET` a user's password
```sql
SET PASSWORD FOR <username> = '<password>'
```
CLI example:
```sql
> SET PASSWORD FOR "todd" = 'influxdb4ever'
>
```
{{% note %}}
**Note:** The password [string](/influxdb/v1.8/query_language/spec/#strings) must be wrapped in single quotes.
Do not include the single quotes when authenticating requests.
We recommend avoiding the single quote (`'`) and backslash (`\`) characters in passwords.
For passwords that include these characters, escape the special character with a backslash (for example, `\'`) when creating the password and when submitting authentication requests.
{{% /note %}}
##### `DROP` a user
```sql
DROP USER <username>
```
CLI example:
```sql
> DROP USER "todd"
>
```
## Authentication and authorization HTTP errors
Requests with no authentication credentials or incorrect credentials yield the `HTTP 401 Unauthorized` response.
Requests by unauthorized users yield the `HTTP 403 Forbidden` response.
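For example, with authentication enabled, a request that omits credentials fails with a `401` (a sketch, assuming a local instance; use `-i` to see the status line):

```bash
# With auth enabled and no credentials supplied, expect HTTP 401 Unauthorized.
curl -i -G "http://localhost:8086/query" --data-urlencode "q=SHOW DATABASES"
```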

View File

@ -0,0 +1,399 @@
---
title: Back up and restore InfluxDB OSS
description: Using InfluxDB OSS backup and restore utilities for online, Enterprise-compatible use and portability between InfluxDB Enterprise and InfluxDB OSS servers.
aliases:
  - /influxdb/v1.8/administration/backup-and-restore/
menu:
  influxdb_1_8:
    name: Back up and restore
    weight: 60
    parent: Administration
---
## Overview
The InfluxDB OSS `backup` utility provides:
* Option to run backup and restore functions on online (live) databases.
* Backup and restore functions for single or multiple databases, along with optional timestamp filtering.
* Import of data from [InfluxDB Enterprise](/enterprise_influxdb/latest/) clusters.
* Backup files that can be imported into an InfluxDB Enterprise database.
> **InfluxDB Enterprise users:** See [Backing up and restoring in InfluxDB Enterprise](/enterprise_influxdb/latest/administration/backup-and-restore/).
> ***Note:*** Prior to InfluxDB OSS 1.5, the `backup` utility created backup file formats incompatible with InfluxDB Enterprise. This legacy format is still supported in the new `backup` utility as input for the new *online* restore function. The *offline* backup and restore utilities in InfluxDB OSS versions 1.4 and earlier are deprecated, but are documented below in [Backward compatible offline backup and restore](#backward-compatible-offline-backup-and-restore-legacy-format).
## Online backup and restore (for InfluxDB OSS)
Use the `backup` and `restore` utilities to back up and restore between `influxd` instances with the same versions or with only minor version differences. For example, you can back up from 1.7.3 and restore on 1.8.0.
### Configuring remote connections
The online backup and restore processes execute over a TCP connection to the database.
**To enable the port for the backup and restore service:**
1. At the root level of the InfluxDB config file (`influxdb.conf`), uncomment the [`bind-address` configuration setting](/influxdb/v1.8/administration/config#bind-address-127-0-0-1-8088) on the remote node.
2. Update the `bind-address` value to `<remote-node-IP>:8088`
3. Provide the IP address and port to the `-host` parameter when you run commands.
**Example**
```
$ influxd backup -portable -database mydatabase -host <remote-node-IP>:8088 /tmp/mysnapshot
```
### `backup`
`backup` generates an InfluxDB Enterprise-compatible format with filtering options to constrain the range of data points exported to the backup. `backup` creates and stores the following in a specified directory (filenames include a UTC timestamp from when the backup was created):

- copy of metastore **on disk**: `20060102T150405Z.meta` (includes usernames and passwords)
- copy of shard data **on disk**: `20060102T150405Z.<shard_id>.tar.gz`
- manifest (JSON file) describing the collected backup data: `20060102T150405Z.manifest`
>**Note:** `backup` ignores WAL files and in-memory cache data.
```
influxd backup
  [ -database <db_name> ]
  [ -portable ]
  [ -host <host:port> ]
  [ -retention <rp_name> ] | [ -shard <shard_ID> -retention <rp_name> ]
  [ -start <timestamp> [ -end <timestamp> ] | -since <timestamp> ]
  <path-to-backup>
```
To invoke the new InfluxDB Enterprise-compatible format, run the `influxd backup` command with the `-portable` flag, like this:
```
influxd backup -portable [ arguments ] <path-to-backup>
```
##### Arguments
Optional arguments are enclosed in brackets.
- `[ -database <db_name> ]`: The database to back up. If not specified, all databases are backed up.
- `[ -portable ]`: Generates backup files in the newer InfluxDB Enterprise-compatible format. Highly recommended for all InfluxDB OSS users.
{{% warn %}}
**Important:** If `-portable` is not specified, the default legacy backup utility is used; only the host metastore is backed up, unless `-database` is specified. If not using `-portable`, review [Backup (legacy)](#backup-legacy) below for expected behavior.
{{% /warn %}}
- `[ -host <host:port> ]`: Host and port for the InfluxDB OSS instance. Default value is `'127.0.0.1:8088'`. Required for remote connections. Example: `-host 127.0.0.1:8088`
- `[ -retention <rp_name> ]`: Retention policy for the backup. If not specified, the default is to use all retention policies. If specified, then `-database` is required.
- `[ -shard <ID> ]`: Shard ID of the shard to be backed up. If specified, then `-retention <name>` is required.
- `[ -start <timestamp> ]`: Include all points starting with the specified timestamp ([RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt)). Not compatible with `-since`. Example: `-start 2015-12-24T08:12:23Z`
- `[ -end <timestamp> ]`: Exclude all results after the specified timestamp ([RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt)). Not compatible with `-since`. If used without `-start`, all data will be backed up starting from 1970-01-01. Example: `-end 2015-12-31T08:12:23Z`
- `[ -since <timestamp> ]`: Perform an incremental backup after the specified timestamp [RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt). Use `-start` instead, unless needed for legacy backup support.
#### Backup examples
**To back up everything:**
```
influxd backup -portable <path-to-backup>
```
**To back up all databases recently changed at the filesystem level:**
```
influxd backup -portable -start <timestamp> <path-to-backup>
```
**To back up only the `telegraf` database:**
```
influxd backup -portable -database telegraf <path-to-backup>
```
**To back up a database for a specified time interval:**
```
influxd backup -portable -database mytsd -start 2017-04-28T06:49:00Z -end 2017-04-28T06:50:00Z /tmp/backup/influxdb
```
### `restore`
An online `restore` process is initiated by using the `restore` command with either the `-portable` argument (indicating the new Enterprise-compatible backup format) or `-online` flag (indicating the legacy backup format).
```
influxd restore [ -db <db_name> ]
  -portable | -online
  [ -host <host:port> ]
  [ -newdb <newdb_name> ]
  [ -rp <rp_name> ]
  [ -newrp <newrp_name> ]
  [ -shard <shard_ID> ]
  <path-to-backup-files>
```
{{% warn %}}
**Restoring backups that specified time periods (using `-start` and `-end`)**

Backups that specify time intervals using the `-start` or `-end` arguments operate on blocks of data, not on a point-by-point basis. Since most blocks are highly compacted, extracting each block to inspect each point would create both a computational and disk-space burden on the running system.
Each data block is annotated with starting and ending timestamps for the time interval included in the block. When you specify `-start` or `-end` timestamps, all of the specified data is backed up, but other data points that fall in the same blocks are backed up as well.
**Expected behavior**
- When restoring data, you are likely to see data that is outside of the specified time periods.
- If duplicate data points are included in the backup files, the points will be written again, overwriting any existing data.
{{% /warn %}}
#### Arguments
Optional arguments are enclosed in brackets.
- `-portable`: Use the new Enterprise-compatible backup format for InfluxDB OSS. Recommended instead of `-online`. A backup created on InfluxDB Enterprise can be restored to an InfluxDB OSS instance.
- `-online`: Use the legacy backup format. Only use if the newer `-portable` option cannot be used.
- `[ -host <host:port> ]`: Host and port for the InfluxDB OSS instance. Default value is `'127.0.0.1:8088'`. Required for remote connections. Example: `-host 127.0.0.1:8088`
- `[ -db <db_name> | -database <db_name> ]`: Name of the database to be restored from the backup. If not specified, all databases will be restored.
- `[ -newdb <newdb_name> ]`: Name of the database into which the archived data will be imported on the target system. If not specified, then the value for `-db` is used. The new database name must be unique to the target system.
- `[ -rp <rp_name> ]`: Name of the retention policy from the backup that will be restored. Requires that `-db` is set. If not specified, all retention policies will be used.
- `[ -newrp <newrp_name> ]`: Name of the retention policy to be created on the target system. Requires that `-rp` is set. If not specified, then the `-rp` value is used.
- `[ -shard <shard_ID> ]`: Shard ID of the shard to be restored. If specified, then `-db` and `-rp` are required.
> **Note:** If you have automated backups based on the legacy format, consider using the new online feature for your legacy backups. The new backup utility lets you restore a single database to a live (online) instance, while leaving all existing data on the server in place. The [offline restore method (described below)](#restore-legacy) may result in data loss, since it clears all existing databases on the server.
#### Restore examples
**To restore all databases found within the backup directory:**
```
influxd restore -portable path-to-backup
```
**To restore only the `telegraf` database (telegraf database must not exist):**
```
influxd restore -portable -db telegraf path-to-backup
```
**To restore data to a database that already exists:**
You cannot restore directly into a database that already exists. If you attempt to run the `restore` command into an existing database, you will get a message like this:
```
influxd restore -portable -db existingdb path-to-backup
2018/08/30 13:42:46 error updating meta: DB metadata not changed. database may already exist
restore: DB metadata not changed. database may already exist
```
1. Restore the existing database backup to a temporary database.
```
influxd restore -portable -db telegraf -newdb telegraf_bak path-to-backup
```
2. Sideload the data (using a `SELECT ... INTO` statement) into the existing target database and drop the temporary database.
```
> USE telegraf_bak
> SELECT * INTO telegraf..:MEASUREMENT FROM /.*/ GROUP BY *
> DROP DATABASE telegraf_bak
```
**To restore to a retention policy that already exists:**
1. Restore the retention policy to a temporary database.
```
influxd restore -portable -db telegraf -newdb telegraf_bak -rp autogen -newrp autogen_bak path-to-backup
```
2. Sideload into the target database and drop the temporary database.
```
> USE telegraf_bak
> SELECT * INTO telegraf.autogen.:MEASUREMENT FROM /telegraf_bak.autogen_bak.*/ GROUP BY *
> DROP DATABASE telegraf_bak
```
### Backward compatible offline backup and restore (legacy format)
> ***Note:*** The backward compatible backup and restore utilities for InfluxDB OSS documented below are deprecated. InfluxData recommends using the newer Enterprise-compatible backup and restore utilities with your InfluxDB OSS servers.
InfluxDB OSS has the ability to snapshot an instance at a point-in-time and restore it.
All backups are full backups; incremental backups are not supported.
Two types of data can be backed up: the metastore and the metrics themselves.
The [metastore](/influxdb/v1.8/concepts/glossary/#metastore) is backed up in its entirety.
The metrics are backed up on a per-database basis in an operation separate from the metastore backup.
#### Backing up the metastore
The InfluxDB metastore contains internal information about the status of
the system, including user information, database and shard metadata, continuous queries, retention policies, and subscriptions.
While a node is running, you can create a backup of your instance's metastore by running the command:
```
influxd backup <path-to-backup>
```
Where `<path-to-backup>` is the directory where you
want the backup written. Without any other arguments,
the backup will only record the current state of the system
metastore. For example, the command:
```bash
$ influxd backup /tmp/backup
2016/02/01 17:15:03 backing up metastore to /tmp/backup/meta.00
2016/02/01 17:15:03 backup complete
```
will create a metastore backup in the directory `/tmp/backup` (the
directory is created if it doesn't already exist).
#### Backup (legacy)
Each database must be backed up individually.
To back up a database, add the `-database` flag:
```bash
influxd backup -database <mydatabase> <path-to-backup>
```
Where `<mydatabase>` is the name of the database you would like to
back up, and `<path-to-backup>` is where the backup data should be
stored.
Optional flags also include:
- `-retention <retention-policy-name>`
- This flag can be used to back up a specific retention policy. For more information on retention policies, see
[Retention policy management](/influxdb/v1.8/query_language/database_management/#retention-policy-management). If unspecified, all retention policies will be backed up.
- `-shard <shard ID>` - This flag can be used to back up a specific
shard ID. To see which shards are available, you can run the command
`SHOW SHARDS` using the InfluxDB query language. If not specified,
all shards will be backed up.
- `-since <date>` - This flag can be used to create a backup _since_ a
specific date, where the date must be in
[RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format (for example,
`2015-12-24T08:12:23Z`). This flag is important if you would like to
take incremental backups of your database. If not specified, all
time ranges within the database will be backed up.
> **Note:** Metastore backups are also included in per-database backups.
As a real-world example, you can take a backup of the `autogen`
retention policy for the `telegraf` database since midnight UTC on
February 1st, 2016 by using the command:
```
$ influxd backup -database telegraf -retention autogen -since 2016-02-01T00:00:00Z /tmp/backup
2016/02/01 18:02:36 backing up rp=default since 2016-02-01 00:00:00 +0000 UTC
2016/02/01 18:02:36 backing up metastore to /tmp/backup/meta.01
2016/02/01 18:02:36 backing up db=telegraf rp=default shard=2 to /tmp/backup/telegraf.default.00002.01 since 2016-02-01 00:00:00 +0000 UTC
2016/02/01 18:02:36 backup complete
```
This will send the resulting backup to `/tmp/backup`, where it can
then be compressed and sent to long-term storage.
#### Remote backups (legacy)
The legacy backup mode also supports live, remote backup functionality.
Follow the directions in [Configuring remote connections](#configuring-remote-connections) above to configure this feature.
#### Restore (legacy)
{{% warn %}} The offline restore method described here may result in data loss: it clears all existing databases on the server. Consider using the `-online` flag with the newer [`restore` method (described above)](#restore) to import legacy data without any data loss.
{{% /warn %}}
To restore a backup, use the `influxd restore` command.
> **Note:** Restoring from backup is only supported while the InfluxDB daemon is stopped.
To restore from a backup, specify the type of backup,
the path to where the backup should be restored, and the path to the backup:
```
influxd restore [ -metadir | -datadir ] <path-to-meta-or-data-directory> <path-to-backup>
```
The required flags for restoring a backup are:
- `-metadir <path-to-meta-directory>` - This is the path to the meta
directory where you would like the metastore backup recovered
to. For packaged installations, this should be specified as
`/var/lib/influxdb/meta`.
- `-datadir <path-to-data-directory>` - This is the path to the data
directory where you would like the database backup recovered to. For
packaged installations, this should be specified as
`/var/lib/influxdb/data`.
The optional flags for restoring a backup are:
- `-database <database>` - This is the database that you would like to
restore the data to. This option is required if no `-metadir` option
is provided.
- `-retention <retention policy>` - This is the target retention policy
for the stored data to be restored to.
- `-shard <shard id>` - This is the shard data that should be
restored. If specified, `-database` and `-retention` must also be
set.
Following the backup example above, the backup can be restored in two
steps.
1. The metastore needs to be restored so that InfluxDB
knows which databases exist:
```
$ influxd restore -metadir /var/lib/influxdb/meta /tmp/backup
Using metastore snapshot: /tmp/backup/meta.00
```
2. Once the metastore has been restored, we can recover the backed-up
data. In the real-world example above, we backed up the `telegraf`
database to `/tmp/backup`, so let's restore that same dataset. To
restore the `telegraf` database:
```
$ influxd restore -database telegraf -datadir /var/lib/influxdb/data /tmp/backup
Restoring from backup /tmp/backup/telegraf.*
unpacking /var/lib/influxdb/data/telegraf/default/2/000000004-000000003.tsm
unpacking /var/lib/influxdb/data/telegraf/default/2/000000005-000000001.tsm
```
> **Note:** Once the backed-up data has been recovered, the permissions on the shards may no longer be accurate. To ensure the file permissions are correct, run: `$ sudo chown -R influxdb:influxdb /var/lib/influxdb`
Once the data and metastore are recovered, start the database:
```bash
$ service influxdb start
```
As a quick check, you can verify that the database is known to the metastore
by running a `SHOW DATABASES` command:
```
influx -execute 'show databases'
name: databases
---------------
name
_internal
telegraf
```
The database has now been successfully restored!

View File

@ -0,0 +1,36 @@
---
title: Compact a series file offline
description: >
  Use the `influx_inspect buildtsi -compact-series-file` command to compact your
  series file and reduce its size on disk.
menu:
  influxdb_1_8:
    weight: 67
    parent: Administration
    name: Compact a series file
---
Use the **Series File Compaction tool** to compact your series file and reduce its size on disk.
This is useful for series files that grow quickly, for example, when series are frequently created and deleted.
**To compact a series file:**
1. Stop the `influxd` process.
2. Run the following command, including your **data directory** and **WAL directory**:
```sh
# Syntax
influx_inspect buildtsi -compact-series-file -datadir <data_dir> -waldir <wal_dir>
# Example
influx_inspect buildtsi -compact-series-file -datadir /data -waldir /wal
```
3. Restart the `influxd` process.
4. **_(InfluxDB Enterprise only)_** On each data node:
   1. Complete steps 1-3.
   2. Wait for the [hinted handoff queue (HHQ)](/enterprise_influxdb/latest/concepts/clustering/#hinted-handoff)
      to write all missed data to the node.
   3. Continue to the next data node.

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,241 @@
---
title: Enable HTTPS with InfluxDB
description: Enable HTTPS and Transport Layer Security (TLS) secure communication between clients and your InfluxDB servers.
menu:
  influxdb_1_8:
    name: Enable HTTPS
    weight: 30
    parent: Administration
---
Enabling HTTPS encrypts the communication between clients and the InfluxDB server.
When configured with a signed certificate, HTTPS can also verify the authenticity of the InfluxDB server to connecting clients.
InfluxData [strongly recommends](/influxdb/v1.8/administration/security/) enabling HTTPS, especially if you plan on sending requests to InfluxDB over a network.
## Requirements
To enable HTTPS with InfluxDB, you'll need an existing or new InfluxDB instance
and a Transport Layer Security (TLS) certificate (also known as a Secure Sockets Layer (SSL) certificate).
InfluxDB supports three types of TLS certificates:
* **Single domain certificates signed by a Certificate Authority**
Single domain certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
With this certificate option, every InfluxDB instance requires a unique single domain certificate.
* **Wildcard certificates signed by a Certificate Authority**
Wildcard certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
Wildcard certificates can be used across multiple InfluxDB instances on different servers.
* **Self-signed certificates**
Self-signed certificates are _not_ signed by a Certificate Authority (CA).
[Generate a self-signed certificate](#step-1-generate-a-self-signed-certificate) on your own machine.
Unlike CA-signed certificates, self-signed certificates only provide cryptographic security to HTTPS requests.
They do not allow clients to verify the identity of the InfluxDB server.
With this certificate option, every InfluxDB instance requires a unique self-signed certificate.
Regardless of your certificate's type, InfluxDB supports certificates composed of
a private key file (`.key`) and a signed certificate file (`.crt`) as a file pair, as well as certificates
that combine the private key file and the signed certificate file into a single bundled file (`.pem`).
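If you have a separate key and certificate pair but want a single bundled file, you can concatenate them yourself (a sketch; the file names are placeholders):

```bash
# Combine the signed certificate and private key into one bundled .pem file.
cat /etc/ssl/influxdb.crt /etc/ssl/influxdb.key | sudo tee /etc/ssl/influxdb.pem > /dev/null
sudo chown influxdb:influxdb /etc/ssl/influxdb.pem
sudo chmod 600 /etc/ssl/influxdb.pem
```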
The following two sections outline how to set up HTTPS with InfluxDB [using a CA-signed
certificate](#set-up-https-with-a-ca-certificate) and [using a self-signed certificate](#set-up-https-with-a-self-signed-certificate)
on Ubuntu 16.04.
Steps may vary for other operating systems.
## Set up HTTPS with a CA certificate
#### Step 1: Install the certificate
Place the private key file (`.key`) and the signed certificate file (`.crt`)
or the single bundled file (`.pem`) in the `/etc/ssl` directory.
#### Step 2: Set certificate file permissions
Users running InfluxDB must have read permissions on the TLS certificate.
>***Note***: You may opt to set up multiple users, groups, and permissions. Ultimately, make sure all users running InfluxDB have read permissions for the TLS certificate.
Run the following commands to give the `influxdb` user ownership of the certificate files and set secure permissions:
```bash
sudo chown influxdb:influxdb /etc/ssl/<CA-certificate-file>
sudo chmod 644 /etc/ssl/<CA-certificate-file>
sudo chmod 600 /etc/ssl/<private-key-file>
```
#### Step 3: Review the TLS configuration settings
By default, InfluxDB supports the values for TLS `ciphers`, `min-version`, and `max-version` listed in the [Constants section of the Go `crypto/tls` package documentation](https://golang.org/pkg/crypto/tls/#pkg-constants) and depends on the version of Go used to build InfluxDB. You can configure InfluxDB to support a restricted list of TLS cipher suite IDs and versions.
For more information, see [Transport Layer Security (TLS) configuration settings](/influxdb/v1.8/administration/config#transport-layer-security-tls-settings).
#### Step 4: Enable HTTPS in the InfluxDB configuration file
HTTPS is disabled by default.
Enable HTTPS in the `[http]` section of the configuration file (`/etc/influxdb/influxdb.conf`) by setting:
* `https-enabled` to `true`
* `https-certificate` to `/etc/ssl/<signed-certificate-file>.crt` (or to `/etc/ssl/<bundled-certificate-file>.pem`)
* `https-private-key` to `/etc/ssl/<private-key-file>.key` (or to `/etc/ssl/<bundled-certificate-file>.pem`)
```toml
[http]
[...]
# Determines whether HTTPS is enabled.
https-enabled = true
[...]
# The SSL certificate to use when HTTPS is enabled.
https-certificate = "<bundled-certificate-file>.pem"
# Use a separate private key location.
https-private-key = "<bundled-certificate-file>.pem"
```
#### Step 5: Restart the InfluxDB service
Restart the InfluxDB process for the configuration changes to take effect:
```bash
sudo systemctl restart influxdb
```
#### Step 6: Verify the HTTPS setup
Verify that HTTPS is working by connecting to InfluxDB with the [CLI tool](/influxdb/v1.8/tools/shell/):
```bash
influx -ssl -host <domain_name>.com
```
A successful connection returns the following:
```bash
Connected to https://<domain_name>.com:8086 version 1.x.x
InfluxDB shell version: 1.x.x
>
```
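You can also spot-check the endpoint without the CLI by querying `/ping` (a sketch; substitute your domain):

```bash
# A healthy TLS-enabled instance responds with HTTP 204 No Content.
curl -i https://<domain_name>.com:8086/ping
```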
That's it! You've successfully set up HTTPS with InfluxDB.
## Set up HTTPS with a self-signed certificate
#### Step 1: Generate a self-signed certificate
The following command generates a private key file (`.key`) and a self-signed
certificate file (`.crt`) which remain valid for the specified `NUMBER_OF_DAYS`.
It outputs those files to the InfluxDB database's default certificate file paths and gives them
the required permissions.
```bash
sudo openssl req -x509 -nodes -newkey rsa:2048 -keyout /etc/ssl/influxdb-selfsigned.key -out /etc/ssl/influxdb-selfsigned.crt -days <NUMBER_OF_DAYS>
```
When you execute the command, it will prompt you for more information.
You can choose to fill out that information or leave it blank;
both actions generate valid certificate files.
Run the following command to give the `influxdb` user ownership of the certificate files:
```bash
sudo chown influxdb:influxdb /etc/ssl/influxdb-selfsigned.*
```
#### Step 2: Review the TLS configuration settings
By default, InfluxDB supports the values for TLS `ciphers`, `min-version`, and `max-version` listed in the [Constants section of the Go `crypto/tls` package documentation](https://golang.org/pkg/crypto/tls/#pkg-constants) and depends on the version of Go used to build InfluxDB. You can configure InfluxDB to support a restricted list of TLS cipher suite IDs and versions. For more information, see [Transport Layer Security (TLS) settings `[tls]`](/influxdb/v1.8/administration/config#transport-layer-security-tls-settings).
#### Step 3: Enable HTTPS in the configuration file
HTTPS is disabled by default.
Enable HTTPS in the `[http]` section of the configuration file (`/etc/influxdb/influxdb.conf`) by setting:
* `https-enabled` to `true`
* `https-certificate` to `/etc/ssl/influxdb-selfsigned.crt`
* `https-private-key` to `/etc/ssl/influxdb-selfsigned.key`
```toml
[http]
[...]
# Determines whether HTTPS is enabled.
https-enabled = true
[...]
# The TLS or SSL certificate to use when HTTPS is enabled.
https-certificate = "/etc/ssl/influxdb-selfsigned.crt"
# Use a separate private key location.
https-private-key = "/etc/ssl/influxdb-selfsigned.key"
```
> If setting up HTTPS for [InfluxDB Enterprise](/enterprise_influxdb), you also need to configure insecure TLS connections between both meta and data nodes in your cluster.
> Instructions are provided in the [InfluxDB Enterprise HTTPS Setup guide](/enterprise_influxdb/latest/guides/https_setup/#set-up-https-with-a-self-signed-certificate).
#### Step 4: Restart InfluxDB
Restart the InfluxDB process for the configuration changes to take effect:
```bash
sudo systemctl restart influxdb
```
#### Step 5: Verify the HTTPS setup
Verify that HTTPS is working by connecting to InfluxDB with the [CLI tool](/influxdb/v1.8/tools/shell/):
```bash
influx -ssl -unsafeSsl -host <domain_name>.com
```
A successful connection returns the following:
```bash
Connected to https://<domain_name>.com:8086 version 1.x.x
InfluxDB shell version: 1.x.x
>
```
That's it! You've successfully set up HTTPS with InfluxDB.
## Connect Telegraf to a secured InfluxDB instance
Connecting [Telegraf](/telegraf/latest/) to an InfluxDB instance that's using
HTTPS requires some additional steps.
In the Telegraf configuration file (`/etc/telegraf/telegraf.conf`), edit the `urls`
setting to indicate `https` instead of `http` and change `localhost` to the
relevant domain name.
If you're using a self-signed certificate, uncomment the `insecure_skip_verify`
setting and set it to `true`.
```toml
###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

# Configuration for InfluxDB server to send metrics to
[[outputs.influxdb]]
  ## The full HTTP or UDP endpoint URL for your InfluxDB instance.
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://localhost:8089"] # UDP endpoint example
  urls = ["https://<domain_name>.com:8086"]

  [...]

  ## Optional SSL Config
  [...]
  insecure_skip_verify = true # <-- Update only if you're using a self-signed certificate
```
Next, restart Telegraf and you're all set!

View File

@ -0,0 +1,351 @@
---
title: Log and trace with InfluxDB
menu:
  influxdb_1_8:
    name: Log and trace
    weight: 40
    parent: Administration
---
**Content**
* [Logging locations](#logging-locations)
* [HTTP access logging](#http-access-logging)
* [Structured logging](#structured-logging)
* [Tracing](#tracing)
## Logging locations
InfluxDB writes log output, by default, to `stderr`.
Depending on your use case, this log information can be written to another location.
### Running InfluxDB directly
When you run InfluxDB directly, using `influxd`, all logs are written to `stderr`.
You can also redirect the log output, as you would any output to `stderr`, like in this example:
```
influxd 2>$HOME/my_log_file
```
### Launched as a service
<!--------------------------- BEGIN TABS ---------------------->
{{< tabs-wrapper >}}
{{% tabs %}}
[systemd](#)
[sysvinit](#)
{{% /tabs %}}
<!--------------------------- BEGIN systemd ------------------->
{{% tab-content %}}
#### systemd
Most Linux systems direct logs to the `systemd` journal.
To access these logs, use this command:
```sh
sudo journalctl -u influxdb.service
```
For more information, see the [`journald.conf` manual page](https://www.freedesktop.org/software/systemd/man/journald.conf.html).
{{% /tab-content %}}
<!--------------------------- END systemd --------------------->
<!--------------------------- BEGIN sysvinit ------------------>
{{% tab-content %}}
#### sysvinit
On Linux systems not using systemd, InfluxDB writes all log data and `stderr` to `/var/log/influxdb/influxd.log`.
You can override this location by setting the environment variable `STDERR` in a start-up script at `/etc/default/influxdb`.
(If this file doesn't exist, you need to create it.)
For example, if `/etc/default/influxdb` contains:
```sh
STDERR=/dev/null
```
all log data is discarded.
Likewise, you can direct output to `stdout` by setting `STDOUT` in the same file.
`stdout` is sent to `/dev/null` by default when InfluxDB is launched as a service.
InfluxDB must be restarted to use any changes to `/etc/default/influxdb`.
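For example, a minimal `/etc/default/influxdb` that writes log output to a flat file instead of discarding it (the path is an assumption; adjust for your system):
```sh
# Send InfluxDB log output (stderr) to a file and discard stdout.
STDERR=/var/log/influxdb/influxd.log
STDOUT=/dev/null
```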
{{% /tab-content %}}
<!--------------------------- END sysvinit --------------------->
{{< /tabs-wrapper >}}
<!--------------------------- END TABS ------------------------>
> #### Log location on macOS
> On macOS, InfluxDB stores logs at `/usr/local/var/log/influxdb.log` by default.
### Using logrotate
You can use [logrotate](http://manpages.ubuntu.com/manpages/cosmic/en/man8/logrotate.8.html) to rotate the log files generated by InfluxDB on systems where logs are written to flat files.
If using the package install on a `sysvinit` system, the config file for logrotate is installed in `/etc/logrotate.d`.
You can view the file [here](https://github.com/influxdb/influxdb/blob/1.8/scripts/logrotate).
## HTTP access logging
Use the HTTP access log to log HTTP request traffic separately from the other InfluxDB log output.
### HTTP access log format
The following is an example of the HTTP access log format. The table below describes each component of the HTTP access log.
```
172.13.8.13,172.39.5.169 - - [21/Jul/2019:03:01:27 +0000] "GET /query?db=metrics&q=SELECT+MEAN%28value%29+as+average_cpu%2C+MAX%28value%29+as+peak_cpu+FROM+%22foo.load%22+WHERE+time+%3E%3D+now%28%29+-+1m+AND+org_id+%21%3D+%27%27+AND+group_id+%21%3D+%27%27+GROUP+BY+org_id%2Cgroup_id HTTP/1.0" 200 11450 "-" "Baz Service" d6ca5a13-at63-11o9-8942-000000000000 9337349
```
| Component | Example |
|--- |--- |
|Host |`172.13.8.13,172.39.5.169` |
|Time of log event |`[21/Jul/2019:03:01:27 +0000]` |
|Request method |`GET` |
|Username |`user` |
|HTTP API call being made&ast; |`/query?db=metrics&q=SELECT+MEAN%28value%29+as+average_cpu%2C+MAX%28value%29+as+peak_cpu+FROM+%22foo.load%22+WHERE+time+%3E%3D+now%28%29+-+1m+AND+org_id+%21%3D+%27%27+AND+group_id+%21%3D+%27%27+GROUP+BY+org_id%2Cgroup_id` |
|Request protocol |`HTTP/1.0` |
|HTTP response code |`200` |
|Size of response in bytes |`11450` |
|Referrer |`-` |
|User agent |`Baz Service` |
|Request ID |`d6ca5a13-at63-11o9-8942-000000000000` |
|Response time in microseconds |`9337349` |
&ast; This field shows the database being accessed and the query being run. For more details, see [InfluxDB API reference](/influxdb/v1.8/tools/api/). Note that this field is URL-encoded.
### Redirecting HTTP access logging
When HTTP request logging is enabled, the HTTP logs are intermingled by default with internal InfluxDB logging. By redirecting the HTTP request log entries to a separate file, both log files are easier to read, monitor, and debug.
**To redirect HTTP request logging:**
Locate the `[http]` section of your InfluxDB configuration file and set the `access-log-path` option to specify the path where HTTP log entries should be written.
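For example (the path is an assumption; use any location writable by the `influxdb` user):
```toml
[http]
  # ...
  # Write HTTP request log entries to a dedicated file.
  access-log-path = "/var/log/influxdb/access.log"
```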
**Notes:**
* If `influxd` is unable to access the specified path, it will log an error and fall back to writing the request log to `stderr`.
* The `[httpd]` prefix is stripped when HTTP request logging is redirected to a separate file, allowing access log parsing tools (like [lnav](https://lnav.org)) to render the files without additional modification.
* To rotate the HTTP request log file, use the `copytruncate` method of `logrotate` or similar to leave the original file in place (a sketch follows this list).
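A minimal `logrotate` stanza using `copytruncate` might look like the following (the path and rotation schedule are assumptions):
```
/var/log/influxdb/access.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```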
## Structured logging
Structured logging enables machine-readable and more developer-friendly log output formats. The two structured log formats, `logfmt` and `json`, provide easier filtering and searching with external tools and simplify integration of InfluxDB logs with Splunk, Papertrail, Elasticsearch, and other third-party tools.
The InfluxDB logging configuration options (in the `[logging]` section) now include the following options:
* `format`: `auto` (default) | `logfmt` | `json`
* `level`: `error` | `warn` | `info` (default) | `debug`
* `suppress-logo`: `false` (default) | `true`
For details on these logging configuration options and their corresponding environment variables, see [Logging options](/influxdb/v1.8/administration/config#logging-settings) in the configuration file documentation.
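For example, to emit warning-level (and above) output in `logfmt` and suppress the startup logo, the `[logging]` section would look like this:
```toml
[logging]
  format = "logfmt"
  level = "warn"
  suppress-logo = true
```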
### Logging formats
Three logging `format` options are available: `auto`, `logfmt`, and `json`. The default logging format setting, `format = "auto"`, lets InfluxDB automatically manage the log encoding format:
* When logging to a file, the `logfmt` format is used.
* When logging to a terminal (or other TTY device), a user-friendly console format is used.
The `json` format is available when specified.
### Examples of log output:
**Logfmt**
```
ts=2018-02-20T22:48:11.291815Z lvl=info msg="InfluxDB starting" version=unknown branch=unknown commit=unknown
ts=2018-02-20T22:48:11.291858Z lvl=info msg="Go runtime" version=go1.10 maxprocs=8
ts=2018-02-20T22:48:11.291875Z lvl=info msg="Loading configuration file" path=/Users/user_name/.influxdb/influxdb.conf
```
**JSON**
```
{"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"InfluxDB starting, version unknown, branch unknown, commit unknown"}
{"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"Go version go1.10, GOMAXPROCS set to 8"}
{"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"Using configuration at: /Users/user_name/.influxdb/influxdb.conf"}
```
**Console/TTY**
```
2018-02-20T22:55:34.246997Z info InfluxDB starting {"version": "unknown", "branch": "unknown", "commit": "unknown"}
2018-02-20T22:55:34.247042Z info Go runtime {"version": "go1.10", "maxprocs": 8}
2018-02-20T22:55:34.247059Z info Loading configuration file {"path": "/Users/user_name/.influxdb/influxdb.conf"}
```
### Logging levels
The `level` option sets the log level to be emitted. Valid logging level settings are `error`, `warn`, `info` (default), and `debug`. Logs that are equal to, or above, the specified level will be emitted.
### Logo suppression
The `suppress-logo` option can be used to suppress the logo output that is printed when the program is started. The logo is always suppressed if `STDOUT` is not a TTY.
## Tracing
Logging has been enhanced to provide tracing of important InfluxDB operations. Tracing is useful for error reporting and discovering performance bottlenecks.
### Logging keys used in tracing
#### Tracing identifier key
The `trace_id` key specifies a unique identifier for a specific instance of a trace. You can use this key to filter and correlate all related log entries for an operation.
All operation traces include consistent starting and ending log entries, with the same message (`msg`) describing the operation (e.g., "TSM compaction"), but adding the appropriate `op_event` context (either `start` or `end`). For an example, see [Finding all trace log entries for an InfluxDB operation](#finding-all-trace-log-entries-for-an-influxdb-operation).
**Example:** `trace_id=06R0P94G000`
#### Operation keys
The following operation keys identify an operation's name, the start and end timestamps, and the elapsed execution time.
##### `op_name`
Unique identifier for an operation. You can filter on all operations of a specific name.
**Example:** `op_name=tsm1_compact_group`
##### `op_event`
Specifies the start and end of an event. The two possible values, `(start)` or `(end)`, are used to indicate when an operation started or ended. For example, you can grep by values in `op_name` AND `op_event` to find all starting operation log entries. For an example of this, see [Finding all starting log entries](#finding-all-starting-operation-log-entries).
**Example:** `op_event=start`
##### `op_elapsed`
Amount of time the operation spent executing. Logged with the ending trace log entry. The time unit displayed depends on how much time has elapsed; for example, if the operation took seconds, the value is suffixed with `s`. Valid time units are `ns`, `µs`, `ms`, and `s`.
**Example:** `op_elapsed=0.352ms`
#### Log identifier context key
The log identifier key (`log_id`) lets you easily identify _every_ log entry for a single execution of an `influxd` process. There are other ways a log file could be split by a single execution, but the consistent `log_id` makes searching in log aggregation services easier.
**Example:** `log_id=06QknqtW000`
#### Database context keys
`db_instance`: Database name
`db_rp`: Retention policy name
`db_shard_id`: Shard identifier
`db_shard_group`: Shard group identifier
### Tooling
Here are a couple of popular tools available for processing and filtering log files output in `logfmt` or `json` formats.
#### [hutils](https://blog.heroku.com/hutils-explore-your-structured-data-logs)
The [hutils](https://blog.heroku.com/hutils-explore-your-structured-data-logs) package, provided by Heroku, is a collection of command-line utilities for working with `logfmt`-encoded logs, including:
* `lcut`: Extracts values from a `logfmt` trace based on a specified field name.
* `lfmt`: Prettifies `logfmt` lines as they emerge from a stream, and highlights their key sections.
* `ltap`: Accesses messages from log providers in a consistent way to allow easy parsing by other utilities that operate on `logfmt` traces.
* `lviz`: Visualizes `logfmt` output by building a tree out of a dataset combining common sets of key-value pairs into shared parent nodes.
#### [lnav (Log File Navigator)](http://lnav.org)
The [lnav (Log File Navigator)](http://lnav.org) is an advanced log file viewer useful for watching and analyzing your log files from a terminal. The lnav viewer provides a single log view, automatic log format detection, filtering, timeline view, pretty-print view, and querying logs using SQL.
### Operations
The following operations, listed by their operation name (`op_name`), are traced in InfluxDB internal logs and available for use without changes in logging level.
#### Initial opening of data files
The `tsdb_open` operation traces include all events related to the initial opening of the `tsdb_store`.
#### Retention policy shard deletions
The `retention.delete_check` operation includes all shard deletions related to the retention policy.
#### TSM snapshotting in-memory cache to disk
The `tsm1_cache_snapshot` operation represents the snapshotting of the TSM in-memory cache to disk.
#### TSM compaction strategies
The `tsm1_compact_group` operation includes all trace log entries related to TSM compaction strategies and displays the related TSM compaction strategy keys:
* `tsm1_strategy`: `level` | `full`
* `tsm1_level`: `1` | `2` | `3`
* `tsm1_optimize`: `true` | `false`
#### Series file compactions
The `series_partition_compaction` operation includes all trace log entries related to series file compactions.
#### Continuous query execution (if logging enabled)
The `continuous_querier_execute` operation includes all continuous query executions, if logging is enabled.
#### TSI log file compaction
The `tsi1_compact_log_file` operation includes all trace log entries for TSI log file compactions.
#### TSI level compaction
The `tsi1_compact_to_level` operation includes all trace log entries for TSI level compactions.
### Tracing examples
#### Finding all trace log entries for an InfluxDB operation
In the example below, you can see the log entries for all trace operations related to a "TSM compaction" process. Note that the initial entry shows the message "TSM compaction (start)" and the final entry displays the message "TSM compaction (end)". \[Note: Log entries were grepped using the `trace_id` value and then the specified key values were displayed using `lcut` (a hutils tool).\]
```
$ grep "06QW92x0000" influxd.log | lcut ts lvl msg strategy level
2018-02-21T20:18:56.880065Z info TSM compaction (start) full
2018-02-21T20:18:56.880162Z info Beginning compaction full
2018-02-21T20:18:56.880185Z info Compacting file full
2018-02-21T20:18:56.880211Z info Compacting file full
2018-02-21T20:18:56.880226Z info Compacting file full
2018-02-21T20:18:56.880254Z info Compacting file full
2018-02-21T20:19:03.928640Z info Compacted file full
2018-02-21T20:19:03.928687Z info Finished compacting files full
2018-02-21T20:19:03.928707Z info TSM compaction (end) full
```
#### Finding all starting operation log entries
To find all starting operation log entries, you can grep by values in `op_name` AND `op_event`. In the following example, the grep returned 101 entries, so the result below only displays the first entry. In the example result entry, the timestamp, level, strategy, trace_id, op_name, and op_event values are included.
```
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start'
ts=2018-02-21T20:16:16.709953Z lvl=info msg="TSM compaction" log_id=06QVNNCG000 engine=tsm1 level=1 strategy=level trace_id=06QV~HHG000 op_name=tsm1_compact_group op_event=start
...
```
Using the `lcut` utility (from hutils), the following command extends the previous `grep` command with `lcut` to display only the keys whose values differ between entries. The result includes 19 unique log entries displaying the selected keys: `ts`, `strategy`, `level`, and `trace_id`.
```
$ grep -F 'op_name=tsm1_compact_group' influxd.log | grep -F 'op_event=start' | lcut ts strategy level trace_id | sort -u
2018-02-21T20:16:16.709953Z level 1 06QV~HHG000
2018-02-21T20:16:40.707452Z level 1 06QW0k0l000
2018-02-21T20:17:04.711519Z level 1 06QW2Cml000
2018-02-21T20:17:05.708227Z level 2 06QW2Gg0000
2018-02-21T20:17:29.707245Z level 1 06QW3jQl000
2018-02-21T20:17:53.711948Z level 1 06QW5CBl000
2018-02-21T20:18:17.711688Z level 1 06QW6ewl000
2018-02-21T20:18:56.880065Z full 06QW92x0000
2018-02-21T20:20:46.202368Z level 3 06QWFizW000
2018-02-21T20:21:25.292557Z level 1 06QWI6g0000
2018-02-21T20:21:49.294272Z level 1 06QWJ_RW000
2018-02-21T20:22:13.292489Z level 1 06QWL2B0000
2018-02-21T20:22:37.292431Z level 1 06QWMVw0000
2018-02-21T20:22:38.293320Z level 2 06QWMZqG000
2018-02-21T20:23:01.293690Z level 1 06QWNygG000
2018-02-21T20:23:25.292956Z level 1 06QWPRR0000
2018-02-21T20:24:33.291664Z full 06QWTa2l000
2018-02-21T21:12:08.017055Z full 06QZBpKG000
2018-02-21T21:12:08.478200Z full 06QZBr7W000
```

View File

@ -0,0 +1,59 @@
---
title: InfluxDB ports
menu:
influxdb_1_8:
name: Ports
weight: 50
parent: Administration
---
## Enabled ports
### `8086`
The default port that runs the InfluxDB HTTP service.
[Configure this port](/influxdb/v1.8/administration/config#bind-address-8086)
in the configuration file.
**Resources** [API Reference](/influxdb/v1.8/tools/api/)
### `8088`
The default port used by the RPC service for RPC calls made by the CLI for backup and restore operations (`influxd backup` and `influxd restore`).
[Configure this port](/influxdb/v1.8/administration/config#bind-address-127-0-0-1-8088)
in the configuration file.
**Resources** [Backup and Restore](/influxdb/v1.8/administration/backup_and_restore/)
## Disabled ports
### `2003`
The default port that runs the Graphite service.
[Enable and configure this port](/influxdb/v1.8/administration/config#bind-address-2003)
in the configuration file.
**Resources** [Graphite README](https://github.com/influxdata/influxdb/tree/1.8/services/graphite/README.md)
### `4242`
The default port that runs the OpenTSDB service.
[Enable and configure this port](/influxdb/v1.8/administration/config#bind-address-4242)
in the configuration file.
**Resources** [OpenTSDB README](https://github.com/influxdata/influxdb/tree/1.8/services/opentsdb/README.md)
### `8089`
The default port that runs the UDP service.
[Enable and configure this port](/influxdb/v1.8/administration/config#bind-address-8089)
in the configuration file.
**Resources** [UDP README](https://github.com/influxdata/influxdb/tree/1.8/services/udp/README.md)
### `25826`
The default port that runs the Collectd service.
[Enable and configure this port](/influxdb/v1.8/administration/config#bind-address-25826)
in the configuration file.
**Resources** [Collectd README](https://github.com/influxdata/influxdb/tree/1.8/services/collectd/README.md)

View File

@ -0,0 +1,52 @@
---
title: Rebuild the TSI index
description: >
Rebuild your InfluxDB TSI index using the `influx_inspect buildtsi` command.
menu:
influxdb_1_8:
weight: 60
parent: Administration
---
The InfluxDB [Time Series Index (TSI)](/influxdb/v1.8/concepts/tsi-details/)
indexes or caches measurement and tag data to ensure queries are performant.
In some cases, it may be necessary to flush and rebuild the TSI index.
Use the following steps to rebuild your InfluxDB TSI index:
## 1. Stop InfluxDB
Stop InfluxDB by stopping the `influxd` process.
## 2. Remove all `_series` directories
Remove all `_series` directories.
By default, `_series` directories are stored at `/data/<dbName>/_series`;
however, you should check for and remove `_series` directories throughout the `/data` directory.
## 3. Remove all index directories
Remove all index directories.
By default, index directories are stored at `/data/<dbName>/<rpName>/<shardID>/index`.
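A shell sketch of steps 2 and 3, assuming the default `/data` layout (verify the paths on your system before deleting anything):
```sh
# Remove all _series directories and all shard index directories.
find /data -type d -name _series -exec rm -rf {} +
find /data -type d -name index -exec rm -rf {} +
```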
## 4. Rebuild the TSI index
Use the [`influx_inspect` command line client (CLI)](/influxdb/v1.8/tools/influx_inspect)
to rebuild the TSI index:
```sh
# Syntax
influx_inspect buildtsi -datadir <data_dir> -waldir <wal_dir>
# Example
influx_inspect buildtsi -datadir /data -waldir /wal
```
## 5. Restart InfluxDB
Restart InfluxDB by starting the `influxd` process.
---
{{% note %}}
## Rebuild the TSI index in an InfluxDB Enterprise cluster
To rebuild the TSI index in an InfluxDB Enterprise cluster, perform the steps
above on each data node in the cluster one after the other.
After restarting the `influxd` process on a data node, allow the
[hinted handoff queue (HHQ)](/enterprise_influxdb/latest/concepts/clustering/#hinted-handoff)
to write all missed data to the updated node before moving on to the next node.
{{% /note %}}

View File

@ -0,0 +1,59 @@
---
title: Manage InfluxDB security
menu:
influxdb_1_8:
name: Manage security
weight: 70
parent: Administration
---
Some customers may choose to install InfluxDB with public internet access; however,
doing so can inadvertently expose your data and invite unwelcome attacks on your database.
Check out the sections below for how to protect the data in your InfluxDB instance.
## Enabling authentication
Password protect your InfluxDB instance to keep any unauthorized individuals
from accessing your data.
Resources:
[Set up Authentication](/influxdb/v1.8/administration/authentication_and_authorization/#set-up-authentication)
## Managing users and permissions
Restrict access by creating individual users and assigning them relevant
read and/or write permissions.
Resources:
[User Types and Privileges](/influxdb/v1.8/administration/authentication_and_authorization/#user-types-and-privileges),
[User Management Commands](/influxdb/v1.8/administration/authentication_and_authorization/#user-management-commands)
## Enabling HTTPS
Enabling HTTPS encrypts the communication between clients and the InfluxDB server.
HTTPS can also verify the authenticity of the InfluxDB server to connecting clients.
Resources:
[Enabling HTTPS](/influxdb/v1.8/administration/https_setup/)
## Configure security headers
HTTP headers allow servers and clients to pass additional information along with requests.
Certain headers help [enforce security](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers#Security) properties.
Resources:
[Configure HTTP headers](/influxdb/v1.8/administration/config/#http-headers)
## Securing your host
### Ports
If you're only running InfluxDB, close all ports on the host except for port `8086`.
You can also use a proxy to port `8086`.
InfluxDB uses port `8088` for remote [backups and restores](/influxdb/v1.8/administration/backup_and_restore/).
We highly recommend closing that port and, if performing a remote backup,
giving specific permission only to the remote machine.
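As an illustration, on a host that uses `ufw` you might allow the HTTP API and deny the backup/restore port (a sketch; adapt to your firewall of choice):
```bash
sudo ufw allow 8086/tcp   # InfluxDB HTTP API
sudo ufw deny 8088/tcp    # RPC port used for backup and restore
```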
### AWS recommendations
We recommend implementing on-disk encryption; InfluxDB does not offer built-in support to encrypt the data.

View File

@ -0,0 +1,166 @@
---
title: Monitor InfluxDB servers
aliases:
- /influxdb/v1.8/administration/statistics/
- /influxdb/v1.8/troubleshooting/statistics/
menu:
influxdb_1_8:
name: Monitor InfluxDB
weight: 80
parent: Administration
---
**On this page**
* [SHOW STATS](#show-stats)
* [SHOW DIAGNOSTICS](#show-diagnostics)
* [Internal monitoring](#internal-monitoring)
* [Useful performance metrics commands](#useful-performance-metrics-commands)
* [InfluxDB `/metrics` HTTP endpoint](#influxdb-metrics-http-endpoint)
InfluxDB can display statistical and diagnostic information about each node.
This information can be very useful for troubleshooting and performance monitoring.
## SHOW STATS
To see node statistics, execute the command `SHOW STATS`.
For details on this command, see [`SHOW STATS`](/influxdb/v1.8/query_language/spec#show-stats) in the InfluxQL specification.
The statistics returned by `SHOW STATS` are stored in memory only, and are reset to zero when the node is restarted.
## SHOW DIAGNOSTICS
To see node diagnostic information, execute the command `SHOW DIAGNOSTICS`.
This returns information such as build information, uptime, hostname, server configuration, memory usage, and Go runtime diagnostics.
For details on this command, see [`SHOW DIAGNOSTICS`](/influxdb/v1.8/query_language/spec#show-diagnostics) in the InfluxQL specification.
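Both statements can be run from the [CLI](/influxdb/v1.8/tools/shell/), for example (add `-username` and `-password` if authentication is enabled):
```bash
influx -execute 'SHOW STATS'
influx -execute 'SHOW DIAGNOSTICS'
```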
## Internal monitoring
InfluxDB also writes statistical and diagnostic information to a database named `_internal`, which records metrics on the internal runtime and service performance.
The `_internal` database can be queried and manipulated like any other InfluxDB database.
Check out the [monitor service README](https://github.com/influxdata/influxdb/blob/1.8/monitor/README.md) and the [internal monitoring blog post](https://www.influxdata.com/blog/how-to-use-the-show-stats-command-and-the-_internal-database-to-monitor-influxdb/) for more detail.
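For example, the following query takes a quick look at recent HTTP-related metrics recorded by the monitor service (the `httpd` measurement name assumes the default monitor configuration):
```bash
influx -database '_internal' -precision 'rfc3339' -execute 'SELECT * FROM "httpd" WHERE time > now() - 5m LIMIT 5'
```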
## Useful performance metrics commands
Below are a collection of commands to find useful performance metrics about your InfluxDB instance.
To find the number of points per second being written to the instance (the `monitor` service must be enabled):
```bash
$ influx -execute 'select derivative(pointReq, 1s) from "write" where time > now() - 5m' -database '_internal' -precision 'rfc3339'
```
To find the number of writes, separated by database, since the beginning of the log file:
```bash
grep 'POST' /var/log/influxdb/influxd.log | awk '{ print $10 }' | sort | uniq -c
```
Or, for systemd systems logging to journald:
```bash
journalctl -u influxdb.service | awk '/POST/ { print $10 }' | sort | uniq -c
```
### InfluxDB `/metrics` HTTP endpoint
> ***Note:*** There are no outstanding PRs for improvements to the `/metrics` endpoint, but we'll add them to the CHANGELOG as they occur.
The InfluxDB `/metrics` endpoint is configured to produce the default Go metrics in Prometheus metrics format.
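For example, fetch the metrics from a local instance on the default port:
```bash
curl http://localhost:8086/metrics
```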
#### Example using the InfluxDB `/metrics` endpoint
Below is an example of the output generated using the `/metrics` endpoint. Note that HELP is available to explain the Go statistics.
```
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.4134e-05
go_gc_duration_seconds{quantile="0.25"} 8.8391e-05
go_gc_duration_seconds{quantile="0.5"} 0.000131335
go_gc_duration_seconds{quantile="0.75"} 0.000169204
go_gc_duration_seconds{quantile="1"} 0.000544705
go_gc_duration_seconds_sum 0.004619405
go_gc_duration_seconds_count 27
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 29
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.10"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 1.581062048e+09
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.808293616e+09
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.494326e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 1.1279913e+07
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction -0.00014404354379774563
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 6.0936192e+07
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 1.581062048e+09
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 3.8551552e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 1.590673408e+09
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 1.6924595e+07
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 0
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 1.62922496e+09
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.520291233297057e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 397
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 2.8204508e+07
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 13888
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 1.4781696e+07
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 1.4893056e+07
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 2.38107752e+09
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 4.366786e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 983040
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 983040
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.711914744e+09
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 16
```

View File

@ -0,0 +1,27 @@
---
title: Stability and compatibility
menu:
influxdb_1_8:
weight: 90
parent: Administration
---
## 1.x API compatibility and stability
One of the more important aspects of the 1.0 release is that this marks the stabilization of our API and storage format. Over the course of the last three years we've iterated aggressively, often breaking the API in the process. With the release of 1.0 and for the entire 1.x line of releases we're committing to the following:
### No breaking InfluxDB API changes
When it comes to the InfluxDB API, if a command works in 1.0 it will work unchanged in all 1.x releases...with one caveat. We will be adding [keywords](/influxdb/v1.8/query_language/spec/#keywords) to the query language. New keywords won't break your queries if you wrap all [identifiers](/influxdb/v1.8/concepts/glossary/#identifier) in double quotes and all string literals in single quotes. This is generally considered best practice so it should be followed anyway. For users following that guideline, the query and ingestion APIs will have no breaking changes for all 1.x releases. Note that this does not include the Go code in the project. The underlying Go API in InfluxDB can and will change over the course of 1.x development. Users should be accessing InfluxDB through the [InfluxDB API](/influxdb/v1.8/tools/api/).
### Storage engine stability
The [TSM](/influxdb/v1.8/concepts/glossary/#tsm-time-structured-merge-tree) storage engine file format is now at version 1. While we may introduce new versions of the format in the 1.x releases, these new versions will run side-by-side with previous versions. What this means for users is there will be no lengthy migrations when upgrading from one 1.x release to another.
### Additive changes
The query engine will have additive changes over the course of the new releases. We'll introduce new query functions and new functionality into the language without breaking backwards compatibility. We may introduce new protocol endpoints (like a binary format) and versions of the line protocol and query API to improve performance and/or functionality, but they will have to run in parallel with the existing versions. Existing versions will be supported for the entirety of the 1.x release line.
### Ongoing support
We'll continue to fix bugs on the 1.x versions of the [line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol), query API, and TSM storage format. Users should expect to upgrade to the latest 1.x.x release for bug fixes, but those releases will all be compatible with the 1.0 API and won't require data migrations. For instance, if a user is running 1.2 and there are bug fixes released in 1.3, they should upgrade to the 1.3 release. Until 1.4 is released, patch fixes will go into 1.3.x. Because all future 1.x releases are drop-in replacements for previous 1.x releases, users should upgrade to the latest in the 1.x line to get all bug fixes.

View File

@ -0,0 +1,207 @@
---
title: Manage subscriptions in InfluxDB
description: InfluxDB uses subscriptions to copy all written data to a local or remote endpoint. This article walks through how InfluxDB subscriptions work, how to configure them, and how to manage them.
menu:
influxdb_1_8:
parent: Administration
name: Manage subscriptions
weight: 100
---
InfluxDB subscriptions are local or remote endpoints to which all data written to InfluxDB is copied.
Subscriptions are primarily used with [Kapacitor](/kapacitor/), but any endpoint
able to accept UDP, HTTP, or HTTPS connections can subscribe to InfluxDB and receive
a copy of all data as it is written.
## How subscriptions work
As data is written to InfluxDB, writes are duplicated to subscriber endpoints via
HTTP, HTTPS, or UDP in [line protocol](/influxdb/v1.8/write_protocols/line_protocol_tutorial/).
The InfluxDB subscriber service creates multiple "writers" ([goroutines](https://golangbot.com/goroutines/))
which send writes to the subscription endpoints.
_The number of writer goroutines is defined by the [`write-concurrency`](/influxdb/v1.8/administration/config#write-concurrency-40) configuration._
As writes occur in InfluxDB, each subscription writer sends the written data to the
specified subscription endpoints.
However, with a high `write-concurrency` (multiple writers) and a high ingest rate,
nanosecond differences in writer processes and the transport layer can result
in writes being received out of order.
> #### Important information about high write loads
> While setting the subscriber `write-concurrency` to greater than 1 does increase your
> subscriber write throughput, it can result in out-of-order writes under high ingest rates.
> Setting `write-concurrency` to 1 ensures writes are passed to subscriber endpoints sequentially,
> but can create a bottleneck under high ingest rates.
>
> What `write-concurrency` should be set to depends on your specific workload
> and need for in-order writes to your subscription endpoint.
## InfluxQL subscription statements
Use the following InfluxQL statements to manage subscriptions:
[`CREATE SUBSCRIPTION`](#create-subscriptions)
[`SHOW SUBSCRIPTIONS`](#show-subscriptions)
[`DROP SUBSCRIPTION`](#remove-subscriptions)
## Create subscriptions
Create subscriptions using the `CREATE SUBSCRIPTION` InfluxQL statement.
Specify the subscription name, the database name and retention policy to subscribe to,
and the URL of the host to which data written to InfluxDB should be copied.
```sql
-- Pattern:
CREATE SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>" DESTINATIONS <ALL|ANY> "<subscription_endpoint_host>"
-- Examples:
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to 'example.com:9090' via HTTP.
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://example.com:9090'
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that round-robins the data to 'h1.example.com:9090' and 'h2.example.com:9090' via UDP.
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ANY 'udp://h1.example.com:9090', 'udp://h2.example.com:9090'
```
If authentication is enabled on the subscriber host, adapt the URL to include the credentials.
```
-- Create a SUBSCRIPTION on database 'mydb' and retention policy 'autogen' that sends data to another InfluxDB on 'example.com:8086' via HTTP. Authentication is enabled on the subscription host (user: subscriber, pass: secret).
CREATE SUBSCRIPTION "sub0" ON "mydb"."autogen" DESTINATIONS ALL 'http://subscriber:secret@example.com:8086'
```
{{% warn %}}
`SHOW SUBSCRIPTIONS` outputs all subscriber URLs in plain text, including those with authentication credentials.
Any user with the privileges to run `SHOW SUBSCRIPTIONS` is able to see these credentials.
{{% /warn %}}
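To see exactly what a subscriber receives, you can point a UDP subscription like the ones above at a simple listener (a debugging sketch; the port is arbitrary):
```bash
# Print line protocol writes forwarded over UDP on port 9090.
nc -ul 9090
```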
### Sending subscription data to multiple hosts
The `CREATE SUBSCRIPTION` statement allows you to specify multiple hosts as endpoints for the subscription.
In your `DESTINATIONS` clause, you can pass multiple host strings separated by commas.
Using `ALL` or `ANY` in the `DESTINATIONS` clause determines how InfluxDB writes data to each endpoint:
`ALL`: Writes data to all specified hosts.
`ANY`: Round-robins writes between specified hosts.
_**Subscriptions with multiple hosts**_
```sql
-- Write all data to multiple hosts
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ALL 'http://host1.example.com:9090', 'http://host2.example.com:9090'
-- Round-robin writes between multiple hosts
CREATE SUBSCRIPTION "mysub" ON "mydb"."autogen" DESTINATIONS ANY 'http://host1.example.com:9090', 'http://host2.example.com:9090'
```
### Subscription protocols
Subscriptions can use HTTP, HTTPS, or UDP transport protocols.
Which to use is determined by the protocol expected by the subscription endpoint.
If creating a Kapacitor subscription, this is defined by the `subscription-protocol`
option in the `[[influxdb]]` section of your [`kapacitor.conf`](/kapacitor/latest/administration/subscription-management/#subscription-protocol).
_**kapacitor.conf**_
```toml
[[influxdb]]
# ...
subscription-protocol = "http"
# ...
```
_For information regarding HTTPS connections and secure communication between InfluxDB and Kapacitor,
view the [Kapacitor security](/kapacitor/v1.5/administration/security/#secure-influxdb-and-kapacitor) documentation._
## Show subscriptions
The `SHOW SUBSCRIPTIONS` InfluxQL statement returns a list of all subscriptions registered in InfluxDB.
```sql
SHOW SUBSCRIPTIONS
```
_**Example output:**_
```bash
name: _internal
retention_policy name mode destinations
---------------- ---- ---- ------------
monitor kapacitor-39545771-7b64-4692-ab8f-1796c07f3314 ANY [http://localhost:9092]
```
## Remove subscriptions
Remove or drop subscriptions using the `DROP SUBSCRIPTION` InfluxQL statement.
```sql
-- Pattern:
DROP SUBSCRIPTION "<subscription_name>" ON "<db_name>"."<retention_policy>"
-- Example:
DROP SUBSCRIPTION "sub0" ON "mydb"."autogen"
```
### Drop all subscriptions
In some cases, it may be necessary to remove all subscriptions.
Run the following bash script that utilizes the `influx` CLI, loops through all subscriptions, and removes them.
This script depends on the `$INFLUXUSER` and `$INFLUXPASS` environment variables.
If these are not set, export them as part of the script.
```bash
# Environment variable exports:
# Uncomment these if INFLUXUSER and INFLUXPASS are not already globally set.
# export INFLUXUSER=influxdb-username
# export INFLUXPASS=influxdb-password
IFS=$'\n'
for i in $(influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal -execute 'show subscriptions' | tail -n +2 | grep -v name); do
  influx -format csv -username $INFLUXUSER -password $INFLUXPASS -database _internal \
    -execute "drop subscription \"$(echo "$i" | cut -f 3 -d ',')\" ON \"$(echo "$i" | cut -f 1 -d ',')\".\"$(echo "$i" | cut -f 2 -d ',')\""
done
```
## Configure InfluxDB subscriptions
InfluxDB subscription configuration options are available in the `[subscriber]`
section of the `influxdb.conf`.
To use subscriptions, the `enabled` option in the `[subscriber]` section must be set to `true`.
Below is an example `influxdb.conf` subscriber configuration:
```toml
[subscriber]
enabled = true
http-timeout = "30s"
insecure-skip-verify = false
ca-certs = ""
write-concurrency = 40
write-buffer-size = 1000
```
_**Descriptions of `[subscriber]` configuration options are available in the [Configuring InfluxDB](/influxdb/v1.8/administration/config#subscription-settings) documentation.**_
## Troubleshooting
### Inaccessible or decommissioned subscription endpoints
Unless a subscription is [dropped](#remove-subscriptions), InfluxDB assumes the endpoint
should always receive data and will continue to attempt to send data.
If an endpoint host is inaccessible or has been decommissioned, you will see errors
similar to the following:
```bash
# Some message content omitted (...) for the sake of brevity
"Post http://x.y.z.a:9092/write?consistency=...: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" ... service=subscriber
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp x.y.z.a:9092: getsockopt: connection refused" ... service=subscriber
"Post http://x.y.z.a:9092/write?consistency=...: dial tcp 172.31.36.5:9092: getsockopt: no route to host" ... service=subscriber
```
In some cases, this may be caused by a networking error or something similar
preventing a successful connection to the subscription endpoint.
In other cases, it's because the subscription endpoint no longer exists and
the subscription hasn't been dropped from InfluxDB.
> Because InfluxDB does not know if a subscription endpoint will or will not become accessible again,
> subscriptions are not automatically dropped when an endpoint becomes inaccessible.
> If a subscription endpoint is removed, you must manually [drop the subscription](#remove-subscriptions) from InfluxDB.

View File

@ -0,0 +1,62 @@
---
title: Upgrade to InfluxDB 1.8.x
menu:
influxdb_1_8:
name: Upgrade InfluxDB
weight: 25
parent: Administration
---
We recommend enabling the Time Series Index (TSI) (see step 3 of [Upgrade to InfluxDB 1.8.x](#upgrade-to-influxdb-1-8-x) below). [Switch between TSM and TSI](#switch-index-types) as needed. To learn more about TSI, see:
- [Time Series Index (TSI) overview](/influxdb/v1.8/concepts/time-series-index/)
- [Time Series Index (TSI) details](/influxdb/v1.8/concepts/tsi-details/)
> **_Note:_** The default configuration continues to use TSM-based shards with in-memory indexes (as in earlier versions).
{{% note %}}
### Upgrade to InfluxDB Enterprise
To upgrade from InfluxDB OSS to InfluxDB Enterprise, [contact InfluxData Sales](https://www.influxdata.com/contact-sales/)
and see [Migrate to InfluxDB Enterprise](/enterprise_influxdb/latest/guides/migration/).
{{% /note %}}
## Upgrade to InfluxDB 1.8.x
1. [Download](https://portal.influxdata.com/downloads) InfluxDB version 1.8.x and [install the upgrade](/influxdb/v1.8/introduction/installation).
2. Migrate configuration file customizations from your existing configuration file to the InfluxDB 1.8.x [configuration file](/influxdb/v1.8/administration/config/). Add or modify your environment variables as needed.
3. To enable TSI in InfluxDB 1.8.x, complete the following steps:
a. If using the InfluxDB configuration file, find the `[data]` section, uncomment `index-version = "inmem"`, and change the value to `tsi1` (see the snippet following step 4).
b. If using environment variables, set `INFLUXDB_DATA_INDEX_VERSION` to `tsi1`.
c. Delete shard `index` directories (by default, located at `/<shard_ID>/index`).
d. Build TSI by running the [influx_inspect buildtsi](/influxdb/v1.8/tools/influx_inspect/#buildtsi) command.
> **Note:** Run the `buildtsi` command using the user account that you are going to run the database as, or ensure that the permissions match afterward.
4. Restart the `influxdb` service.
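For reference, after step 3a the relevant setting in the `[data]` section of `influxdb.conf` would look like this:
```toml
[data]
  index-version = "tsi1"
```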
## Switch index types
Switch index types at any time by doing one of the following:
- To switch from `inmem` to `tsi1`, complete steps 3 and 4 above in [Upgrade to InfluxDB 1.8.x](#upgrade-to-influxdb-1-8-x).
- To switch from `tsi1` to `inmem`, change `tsi1` to `inmem` by completing steps 3a-3c and 4 above in [Upgrade to InfluxDB 1.8.x](#upgrade-to-influxdb-1-8-x).
## Downgrade InfluxDB
To downgrade to an earlier version, complete the procedures above in [Upgrade to InfluxDB 1.8.x](#upgrade-to-influxdb-1-8-x), replacing the version numbers with the version that you want to downgrade to.
After downloading the release, migrating your configuration settings, and enabling TSI or TSM, make sure to [rebuild your index](/influxdb/v1.8/administration/rebuild-tsi-index/#sidebar).
>**Note:** Some versions of InfluxDB may have breaking changes that impact your ability to upgrade and downgrade. For example, you cannot downgrade from InfluxDB 1.3 or later to an earlier version. Please review the applicable version of release notes to check for compatibility issues between releases.
## Upgrade InfluxDB Enterprise clusters
See [Upgrading InfluxDB Enterprise clusters](/enterprise_influxdb/v1.8/administration/upgrading/).

View File

@ -0,0 +1,40 @@
---
title: InfluxDB concepts
menu:
influxdb_1_8:
name: Concepts
weight: 30
---
Understanding the following concepts will help you get the most out of InfluxDB.
## [Key concepts](/influxdb/v1.8/concepts/key_concepts/)
A brief explanation of the InfluxDB core architecture.
## [Glossary of terms](/influxdb/v1.8/concepts/glossary/)
A list of InfluxDB terms and their definitions.
## [Comparison to SQL](/influxdb/v1.8/concepts/crosswalk/)
## [Design insights and tradeoffs](/influxdb/v1.8/concepts/insights_tradeoffs/)
A brief treatment of some of the performance tradeoffs made during the design phase of InfluxDB.
## [Schema and data layout](/influxdb/v1.8/concepts/schema_and_data_layout/)
A useful overview of the InfluxDB time series data structure and how it affects performance.
## [TSM storage engine](/influxdb/v1.8/concepts/storage_engine/)
An overview of how InfluxDB stores data on disk and uses TSM for in-memory indexing.
## [TSI (Time Series Index) overview](/influxdb/v1.8/concepts/time-series-index/)
An overview of how InfluxDB uses TSI (Time Series Index) for disk-based indexing.
## [TSI (Time Series Index) details](/influxdb/v1.8/concepts/tsi-details/)
A detailed look at how TSI works, the file structure, and tooling.

View File

@ -0,0 +1,218 @@
---
title: Compare InfluxDB to SQL databases
menu:
influxdb_1_8:
name: Compare InfluxDB to SQL databases
weight: 30
parent: Concepts
---
InfluxDB is similar to a SQL database, but different in many ways.
InfluxDB is purpose-built for time series data.
Relational databases _can_ handle time series data, but are not optimized for common time series workloads.
InfluxDB is designed to store large volumes of time series data and quickly perform real-time analysis on that data.
### Timing is everything
In InfluxDB, a timestamp identifies a single point in any given data series.
This is like an SQL database table where the primary key is pre-set by the system and is always time.
InfluxDB also recognizes that your [schema](/influxdb/v1.8/concepts/glossary/#schema) preferences may change over time.
In InfluxDB you don't have to define schemas up front.
Data points can have one of the fields on a measurement, all of the fields on a measurement, or any number in-between.
You can add new fields to a measurement simply by writing a point for that new field.
If you need an explanation of the terms measurements, tags, and fields, check out the next section for an SQL database to InfluxDB terminology crosswalk.
## Terminology
The table below is a (very) simple example of a table called `foodships` in an SQL database
with the unindexed column `#_foodships` and the indexed columns `park_id`, `planet`, and `time`.
``` sql
+---------+---------+---------------------+--------------+
| park_id | planet | time | #_foodships |
+---------+---------+---------------------+--------------+
| 1 | Earth | 1429185600000000000 | 0 |
| 1 | Earth | 1429185601000000000 | 3 |
| 1 | Earth | 1429185602000000000 | 15 |
| 1 | Earth | 1429185603000000000 | 15 |
| 2 | Saturn | 1429185600000000000 | 5 |
| 2 | Saturn | 1429185601000000000 | 9 |
| 2 | Saturn | 1429185602000000000 | 10 |
| 2 | Saturn | 1429185603000000000 | 14 |
| 3 | Jupiter | 1429185600000000000 | 20 |
| 3 | Jupiter | 1429185601000000000 | 21 |
| 3 | Jupiter | 1429185602000000000 | 21 |
| 3 | Jupiter | 1429185603000000000 | 20 |
| 4 | Saturn | 1429185600000000000 | 5 |
| 4 | Saturn | 1429185601000000000 | 5 |
| 4 | Saturn | 1429185602000000000 | 6 |
| 4 | Saturn | 1429185603000000000 | 5 |
+---------+---------+---------------------+--------------+
```
Those same data look like this in InfluxDB:
```sql
name: foodships
tags: park_id=1, planet=Earth
time #_foodships
---- ------------
2015-04-16T12:00:00Z 0
2015-04-16T12:00:01Z 3
2015-04-16T12:00:02Z 15
2015-04-16T12:00:03Z 15
name: foodships
tags: park_id=2, planet=Saturn
time #_foodships
---- ------------
2015-04-16T12:00:00Z 5
2015-04-16T12:00:01Z 9
2015-04-16T12:00:02Z 10
2015-04-16T12:00:03Z 14
name: foodships
tags: park_id=3, planet=Jupiter
time #_foodships
---- ------------
2015-04-16T12:00:00Z 20
2015-04-16T12:00:01Z 21
2015-04-16T12:00:02Z 21
2015-04-16T12:00:03Z 20
name: foodships
tags: park_id=4, planet=Saturn
time #_foodships
---- ------------
2015-04-16T12:00:00Z 5
2015-04-16T12:00:01Z 5
2015-04-16T12:00:02Z 6
2015-04-16T12:00:03Z 5
```
Referencing the example above, in general:
* An InfluxDB measurement (`foodships`) is similar to an SQL database table.
* InfluxDB tags ( `park_id` and `planet`) are like indexed columns in an SQL database.
* InfluxDB fields (`#_foodships`) are like unindexed columns in an SQL database.
* InfluxDB points (for example, `2015-04-16T12:00:00Z 5`) are similar to SQL rows.
Building on this comparison of database terminology,
InfluxDB [continuous queries](/influxdb/v1.8/concepts/glossary/#continuous-query-cq)
and [retention policies](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) are
similar to stored procedures in an SQL database.
They're specified once and then performed regularly and automatically.
Of course, there are some major disparities between SQL databases and InfluxDB.
SQL `JOIN`s aren't available for InfluxDB measurements; your schema design should reflect that difference.
And, as we mentioned above, a measurement is like an SQL table where the primary index is always pre-set to time.
InfluxDB timestamps must be in UNIX epoch (GMT) or formatted as a date-time string valid under RFC3339.
For more detailed descriptions of the InfluxDB terms mentioned in this section see our [Glossary of Terms](/influxdb/v1.8/concepts/glossary/).
## Query languages
InfluxDB supports multiple query languages:
- [Flux](#flux)
- [InfluxQL](#influxql)
### Flux
[Flux](/flux/latest/introduction) is a data scripting language designed for querying, analyzing, and acting on time series data.
Beginning with **InfluxDB 1.8.0**, Flux is available for production use alongside InfluxQL.
For those familiar with [InfluxQL](#influxql), Flux is intended to address
many of the outstanding feature requests that we've received since introducing InfluxDB 1.0.
For a comparison between Flux and InfluxQL, see [Flux vs InfluxQL](/flux/latest/introduction/flux-vs-influxql).
Flux is the primary language for working with data in [InfluxDB 2.0 OSS](https://v2.docs.influxdata.com/v2.0/get-started)
(currently in _beta_) and [InfluxDB Cloud 2.0](https://v2.docs.influxdata.com/v2.0/cloud/get-started/),
a generally available Platform as a Service (PaaS) available across multiple Cloud Service Providers.
Using Flux with InfluxDB 1.8+ lets you get familiar with Flux concepts and syntax
and ease the transition to InfluxDB 2.0.
### InfluxQL
InfluxQL is an SQL-like query language for interacting with InfluxDB.
It has been crafted to feel familiar to those coming from other
SQL or SQL-like environments while also providing features specific
to storing and analyzing time series data.
However, **InfluxQL is not SQL** and lacks support for more advanced operations
like `UNION`, `JOIN`, and `HAVING` that SQL power users are accustomed to.
This functionality is available with [Flux](/flux/latest/introduction).
InfluxQL's `SELECT` statement follows the form of an SQL `SELECT` statement:
```sql
SELECT <stuff> FROM <measurement_name> WHERE <some_conditions>
```
where `WHERE` is optional.
To get the InfluxDB output in the section above, you'd enter:
```sql
SELECT * FROM "foodships"
```
If you only wanted to see data for the planet `Saturn`, you'd enter:
```sql
SELECT * FROM "foodships" WHERE "planet" = 'Saturn'
```
If you wanted to see data for the planet `Saturn` after 12:00:01 UTC on April 16, 2015, you'd enter:
```sql
SELECT * FROM "foodships" WHERE "planet" = 'Saturn' AND time > '2015-04-16 12:00:01'
```
As shown in the example above, InfluxQL allows you to specify the time range of your query in the `WHERE` clause.
You can use date-time strings wrapped in single quotes that have the
format `YYYY-MM-DD HH:MM:SS.mmm`
(`mmm` is milliseconds and is optional, and you can also specify microseconds or nanoseconds).
You can also use relative time with `now()` which refers to the server's current timestamp:
```sql
SELECT * FROM "foodships" WHERE time > now() - 1h
```
That query outputs the data in the `foodships` measurement where the timestamp is newer than the server's current time minus one hour.
The options for specifying time durations with `now()` are:
| Letter | Meaning      |
|:------:|:------------:|
| ns     | nanoseconds  |
| u or µ | microseconds |
| ms     | milliseconds |
| s      | seconds      |
| m      | minutes      |
| h      | hours        |
| d      | days         |
| w      | weeks        |
InfluxQL also supports regular expressions, arithmetic in expressions, `SHOW` statements, and `GROUP BY` statements.
See our [data exploration](/influxdb/v1.8/query_language/data_exploration/) page for an in-depth discussion of those topics.
InfluxQL functions include `COUNT`, `MIN`, `MAX`, `MEDIAN`, `DERIVATIVE` and more.
For a full list check out the [functions](/influxdb/v1.8/query_language/functions/) page.
Now that you have the general idea, check out our [Getting Started Guide](/influxdb/v1.8/introduction/getting-started/).
## InfluxDB is not CRUD
InfluxDB is a database that has been optimized for time series data.
This data commonly comes from sources like distributed sensor groups, click data from large websites, or lists of financial transactions.
One thing this data has in common is that it is more useful in the aggregate.
One reading saying that your computer's CPU is at 12% utilization at 12:38:35 UTC on a Tuesday is hard to draw conclusions from.
It becomes more useful when combined with the rest of the series and visualized.
This is where trends over time begin to show, and actionable insight can be drawn from the data.
In addition, time series data is generally written once and rarely updated.
The result is that InfluxDB is not a full CRUD database but more like a CR-ud, prioritizing the performance of creating and reading data over update and destroy, and [preventing some update and destroy behaviors](/influxdb/v1.8/concepts/insights_tradeoffs/) to make create and read more performant:
* To update a point, insert one with [the same measurement, tag set, and timestamp](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) (see the sketch after this list).
* You can [drop or delete a series](/influxdb/v1.8/query_language/database_management/#drop-series-from-the-index-with-drop-series), but not individual points based on field values. As a workaround, you can search for the field value, retrieve the time, then [DELETE based on the `time` field](/influxdb/v1.8/query_language/database_management/#delete-series-with-delete).
* You can't update or rename tags yet - see GitHub issue [#4157](https://github.com/influxdata/influxdb/issues/4157) for more information. To modify the tag of a series of points, find the points with the offending tag value, change the value to the desired one, write the points back, then drop the series with the old tag value.
* You can't delete tags by tag key (as opposed to value) - see GitHub issue [#8604](https://github.com/influxdata/influxdb/issues/8604).
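For example, a minimal sketch of the update workaround using the HTTP API (the database, measurement, tags, and values are hypothetical):
```bash
# Writing the same measurement, tag set, and timestamp overwrites the field value.
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  --data-binary 'cpu,host=server01 value=0.5 1429185600000000000'
```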

View File

@ -0,0 +1,386 @@
---
title: InfluxDB glossary
menu:
influxdb_1_8:
name: Glossary
weight: 20
parent: Concepts
---
## aggregation
An InfluxQL function that returns an aggregated value across a set of points.
For a complete list of the available and upcoming aggregations, see [InfluxQL functions](/influxdb/v1.8/query_language/functions/#aggregations).
Related entries: [function](/influxdb/v1.8/concepts/glossary/#function), [selector](/influxdb/v1.8/concepts/glossary/#selector), [transformation](/influxdb/v1.8/concepts/glossary/#transformation)
## batch
A collection of data points in InfluxDB line protocol format, separated by newlines (`0x0A`).
A batch of points may be submitted to the database using a single HTTP request to the write endpoint.
This makes writes using the InfluxDB API much more performant by drastically reducing the HTTP overhead.
InfluxData recommends batch sizes of 5,000-10,000 points, although different use cases may be better served by significantly smaller or larger batches.
Related entries: [InfluxDB line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol), [point](/influxdb/v1.8/concepts/glossary/#point)
## bucket
A bucket is a named location where time series data is stored in **InfluxDB 2.0**. In InfluxDB 1.8+, each combination of a database and a retention policy (database/retention-policy) represents a bucket. Use the [InfluxDB 2.0 API compatibility endpoints](/influxdb/v1.8/tools/api#influxdb-2-0-api-compatibility-endpoints) included with InfluxDB 1.8+ to interact with buckets.
## continuous query (CQ)
An InfluxQL query that runs automatically and periodically within a database.
Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
See [Continuous Queries](/influxdb/v1.8/query_language/continuous_queries/).
Related entries: [function](/influxdb/v1.8/concepts/glossary/#function)
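For example, a minimal continuous query that downsamples raw data into 30-minute means (the database, measurement, and field names are hypothetical):

```
CREATE CONTINUOUS QUERY "cq_30m" ON "my_database"
BEGIN
  SELECT mean("temperature") INTO "average_temperature" FROM "weather"
  GROUP BY time(30m)
END
```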
## database
A logical container for users, retention policies, continuous queries, and time series data.
Related entries: [continuous query](/influxdb/v1.8/concepts/glossary/#continuous-query-cq), [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp), [user](/influxdb/v1.8/concepts/glossary/#user)
## duration
The attribute of the retention policy that determines how long InfluxDB stores data.
Data older than the duration are automatically dropped from the database.
See [Database Management](/influxdb/v1.8/query_language/database_management/#create-retention-policies-with-create-retention-policy) for how to set duration.
Related entries: [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp)
## field
The key-value pair in an InfluxDB data structure that records metadata and the actual data value.
Fields are required in InfluxDB data structures and they are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags.
*Query tip:* Compare fields to tags; tags are indexed.
Related entries: [field key](/influxdb/v1.8/concepts/glossary/#field-key), [field set](/influxdb/v1.8/concepts/glossary/#field-set), [field value](/influxdb/v1.8/concepts/glossary/#field-value), [tag](/influxdb/v1.8/concepts/glossary/#tag)
## field key
The key part of the key-value pair that makes up a field.
Field keys are strings and they store metadata.
Related entries: [field](/influxdb/v1.8/concepts/glossary/#field), [field set](/influxdb/v1.8/concepts/glossary/#field-set), [field value](/influxdb/v1.8/concepts/glossary/#field-value), [tag key](/influxdb/v1.8/concepts/glossary/#tag-key)
## field set
The collection of field keys and field values on a point.
Related entries: [field](/influxdb/v1.8/concepts/glossary/#field), [field key](/influxdb/v1.8/concepts/glossary/#field-key), [field value](/influxdb/v1.8/concepts/glossary/#field-value), [point](/influxdb/v1.8/concepts/glossary/#point)
## field value
The value part of the key-value pair that makes up a field.
Field values are the actual data; they can be strings, floats, integers, or booleans.
A field value is always associated with a timestamp.
Field values are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant.
*Query tip:* Compare field values to tag values; tag values are indexed.
Related entries: [field](/influxdb/v1.8/concepts/glossary/#field), [field key](/influxdb/v1.8/concepts/glossary/#field-key), [field set](/influxdb/v1.8/concepts/glossary/#field-set), [tag value](/influxdb/v1.8/concepts/glossary/#tag-value), [timestamp](/influxdb/v1.8/concepts/glossary/#timestamp)
## function
InfluxQL aggregations, selectors, and transformations.
See [InfluxQL Functions](/influxdb/v1.8/query_language/functions/) for a complete list of InfluxQL functions.
Related entries: [aggregation](/influxdb/v1.8/concepts/glossary/#aggregation), [selector](/influxdb/v1.8/concepts/glossary/#selector), [transformation](/influxdb/v1.8/concepts/glossary/#transformation)
## identifier
Tokens that refer to continuous query names, database names, field keys,
measurement names, retention policy names, subscription names, tag keys, and
user names.
See [Query Language Specification](/influxdb/v1.8/query_language/spec/#identifiers).
Related entries:
[database](/influxdb/v1.8/concepts/glossary/#database),
[field key](/influxdb/v1.8/concepts/glossary/#field-key),
[measurement](/influxdb/v1.8/concepts/glossary/#measurement),
[retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp),
[tag key](/influxdb/v1.8/concepts/glossary/#tag-key),
[user](/influxdb/v1.8/concepts/glossary/#user)
## InfluxDB line protocol
The text-based format for writing points to InfluxDB. See [InfluxDB line protocol](/influxdb/v1.8/write_protocols/).
## measurement
The part of the InfluxDB data structure that describes the data stored in the associated fields.
Measurements are strings.
Related entries: [field](/influxdb/v1.8/concepts/glossary/#field), [series](/influxdb/v1.8/concepts/glossary/#series)
## metastore
Contains internal information about the status of the system.
The metastore contains the user information, databases, retention policies, shard metadata, continuous queries, and subscriptions.
Related entries: [database](/influxdb/v1.8/concepts/glossary/#database), [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp), [user](/influxdb/v1.8/concepts/glossary/#user)
## node
An independent `influxd` process.
Related entries: [server](/influxdb/v1.8/concepts/glossary/#server)
## now()
The local server's nanosecond timestamp.
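`now()` is most often used to query recent data relative to the server's current time; for example (the measurement name is hypothetical):

```
> SELECT * FROM "cpu_usage" WHERE time > now() - 1h
```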
## point
In InfluxDB, a point represents a single data record, similar to a row in a SQL database table. Each point:
- has a measurement, a tag set, a field key, a field value, and a timestamp;
- is uniquely identified by its series and timestamp.
You cannot store more than one point with the same timestamp in a series.
If you write a point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, and any ties go to the new field set.
For more information about duplicate points, see [How does InfluxDB handle duplicate points?](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)
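A minimal sketch of this behavior, using hypothetical writes from the `influx` CLI:

```
> INSERT census,location=1 butterflies=12 1439856000000000000
> INSERT census,location=1 honeybees=23 1439856000000000000
# The second write has the same series and timestamp, so the field sets merge:
# the series now contains one point with butterflies=12 and honeybees=23.
```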
Related entries: [field set](/influxdb/v1.8/concepts/glossary/#field-set), [series](/influxdb/v1.8/concepts/glossary/#series), [timestamp](/influxdb/v1.8/concepts/glossary/#timestamp)
## points per second
A deprecated measurement of the rate at which data are persisted to InfluxDB.
The schema allows and even encourages the recording of multiple metric values per point, rendering points per second ambiguous.
Write speeds are generally quoted in values per second, a more precise metric.
Related entries: [point](/influxdb/v1.8/concepts/glossary/#point), [schema](/influxdb/v1.8/concepts/glossary/#schema), [values per second](/influxdb/v1.8/concepts/glossary/#values-per-second)
## query
An operation that retrieves data from InfluxDB.
See [Data Exploration](/influxdb/v1.8/query_language/data_exploration/), [Schema Exploration](/influxdb/v1.8/query_language/schema_exploration/), [Database Management](/influxdb/v1.8/query_language/database_management/).
## replication factor
The attribute of the retention policy that determines how many copies of data to concurrently store (or retain) in the cluster. Replicating copies ensures that data is available when a data node (or more) is unavailable.
For three nodes or fewer, the default replication factor equals the number of data nodes.
For more than three nodes, the default replication factor is 3. To change the default replication factor, specify the replication factor `n` in the retention policy.
Related entries: [cluster](/influxdb/v0.10/concepts/glossary/#cluster), [duration](/influxdb/v1.8/concepts/glossary/#duration), [node](/influxdb/v1.8/concepts/glossary/#node),
[retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp)
## retention policy (RP)
Describes how long InfluxDB keeps data (duration), how many copies of the data to store in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.
When you create a database, InfluxDB creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.
For more information, see [Retention policy management](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
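For example, assuming a database named `my_database`, the following InfluxQL creates a retention policy that keeps data for two weeks, stores one copy of the data, and serves as the database's default RP:

```
> CREATE RETENTION POLICY "two_weeks" ON "my_database" DURATION 2w REPLICATION 1 DEFAULT
```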
Related entries: [duration](/influxdb/v1.8/concepts/glossary/#duration), [measurement](/influxdb/v1.8/concepts/glossary/#measurement), [replication factor](/influxdb/v1.8/concepts/glossary/#replication-factor), [series](/influxdb/v1.8/concepts/glossary/#series), [shard duration](/influxdb/v1.8/concepts/glossary/#shard-duration), [tag set](/influxdb/v1.8/concepts/glossary/#tag-set)
## schema
How the data are organized in InfluxDB.
The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys.
See [Schema Design](/influxdb/v1.8/concepts/schema_and_data_layout/) for more information.
Related entries: [database](/influxdb/v1.8/concepts/glossary/#database), [field key](/influxdb/v1.8/concepts/glossary/#field-key), [measurement](/influxdb/v1.8/concepts/glossary/#measurement), [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.8/concepts/glossary/#series), [tag key](/influxdb/v1.8/concepts/glossary/#tag-key), [tag value](/influxdb/v1.8/concepts/glossary/#tag-value)
## selector
An InfluxQL function that returns a single point from the range of specified points.
See [InfluxQL Functions](/influxdb/v1.8/query_language/functions/#selectors) for a complete list of the available and upcoming selectors.
Related entries: [aggregation](/influxdb/v1.8/concepts/glossary/#aggregation), [function](/influxdb/v1.8/concepts/glossary/#function), [transformation](/influxdb/v1.8/concepts/glossary/#transformation)
## series
A logical grouping of data defined by shared measurement, tag set, and field key.
Related entries: [field set](/influxdb/v1.8/concepts/glossary/#field-set), [measurement](/influxdb/v1.8/concepts/glossary/#measurement), [tag set](/influxdb/v1.8/concepts/glossary/#tag-set)
## series cardinality
The number of unique database, measurement, tag set, and field key combinations in an InfluxDB instance.
For example, assume that an InfluxDB instance has a single database and one measurement.
The single measurement has two tag keys: `email` and `status`.
If there are three different `email`s, and each email address is associated with two
different `status`es then the series cardinality for the measurement is 6
(3 * 2 = 6):
| email | status |
| :-------------------- | :----- |
| lorr@influxdata.com | start |
| lorr@influxdata.com | finish |
| marv@influxdata.com | start |
| marv@influxdata.com | finish |
| cliff@influxdata.com | start |
| cliff@influxdata.com | finish |
Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags.
Dependent tags are tags that are scoped by another tag and do not increase series
cardinality.
If we add the tag `firstname` to the example above, the series cardinality
would not be 18 (3 * 2 * 3 = 18).
It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag:
| email | status | firstname |
| :-------------------- | :----- | :-------- |
| lorr@influxdata.com | start | lorraine |
| lorr@influxdata.com | finish | lorraine |
| marv@influxdata.com | start | marvin |
| marv@influxdata.com | finish | marvin |
| cliff@influxdata.com | start | clifford |
| cliff@influxdata.com | finish | clifford |
See [SHOW CARDINALITY](/influxdb/v1.8/query_language/spec/#show-cardinality) to learn about the InfluxQL commands for series cardinality.
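For example (the `census` measurement is hypothetical):

```
# Estimated series cardinality for the current database
> SHOW SERIES CARDINALITY

# Exact series cardinality for a specific measurement
> SHOW SERIES EXACT CARDINALITY FROM "census"
```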
Related entries: [field key](#field-key), [measurement](#measurement), [tag key](#tag-key), [tag set](#tag-set)
## series key
A series key identifies a particular series by measurement, tag set, and field key.
For example:
```
# measurement, tag set, field key
h2o_level, location=santa_monica, h2o_feet
```
Related entries: [series](/influxdb/v1.8/concepts/glossary/#series)
## server
A machine, virtual or physical, that is running InfluxDB.
There should only be one InfluxDB process per server.
Related entries: [node](/influxdb/v1.8/concepts/glossary/#node)
## shard
A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk.
Every shard belongs to one and only one shard group.
Multiple shards may exist in a single shard group.
Each shard contains a specific set of series.
All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.
Related entries: [series](/influxdb/v1.8/concepts/glossary/#series), [shard duration](/influxdb/v1.8/concepts/glossary/#shard-duration), [shard group](/influxdb/v1.8/concepts/glossary/#shard-group), [tsm](/influxdb/v1.8/concepts/glossary/#tsm-time-structured-merge-tree)
## shard duration
The shard duration determines how much time each shard group spans.
The specific interval is determined by the `SHARD DURATION` of the retention policy.
See [Retention Policy management](/influxdb/v1.8/query_language/database_management/#retention-policy-management) for more information.
For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week.
Related entries: [database](/influxdb/v1.8/concepts/glossary/#database), [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.8/concepts/glossary/#series), [shard](/influxdb/v1.8/concepts/glossary/#shard), [shard group](/influxdb/v1.8/concepts/glossary/#shard-group)
## shard group
Shard groups are logical containers for shards.
Shard groups are organized by time and retention policy.
Every retention policy that contains data has at least one associated shard group.
A given shard group contains all shards with data for the interval covered by the shard group.
The interval spanned by each shard group is the shard duration.
Related entries: [database](/influxdb/v1.8/concepts/glossary/#database), [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.8/concepts/glossary/#series), [shard](/influxdb/v1.8/concepts/glossary/#shard), [shard duration](/influxdb/v1.8/concepts/glossary/#shard-duration)
## subscription
Subscriptions allow [Kapacitor](/kapacitor/latest/) to receive data from InfluxDB in a push model rather than the pull model based on querying data.
When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor.
Subscriptions can use TCP or UDP for transmitting the writes.
## tag
The key-value pair in the InfluxDB data structure that records metadata.
Tags are an optional part of the data structure, but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant.
*Query tip:* Compare tags to fields; fields are not indexed.
Related entries: [field](/influxdb/v1.8/concepts/glossary/#field), [tag key](/influxdb/v1.8/concepts/glossary/#tag-key), [tag set](/influxdb/v1.8/concepts/glossary/#tag-set), [tag value](/influxdb/v1.8/concepts/glossary/#tag-value)
## tag key
The key part of the key-value pair that makes up a tag.
Tag keys are strings and they store metadata.
Tag keys are indexed so queries on tag keys are performant.
*Query tip:* Compare tag keys to field keys; field keys are not indexed.
Related entries: [field key](/influxdb/v1.8/concepts/glossary/#field-key), [tag](/influxdb/v1.8/concepts/glossary/#tag), [tag set](/influxdb/v1.8/concepts/glossary/#tag-set), [tag value](/influxdb/v1.8/concepts/glossary/#tag-value)
## tag set
The collection of tag keys and tag values on a point.
Related entries: [point](/influxdb/v1.8/concepts/glossary/#point), [series](/influxdb/v1.8/concepts/glossary/#series), [tag](/influxdb/v1.8/concepts/glossary/#tag), [tag key](/influxdb/v1.8/concepts/glossary/#tag-key), [tag value](/influxdb/v1.8/concepts/glossary/#tag-value)
## tag value
The value part of the key-value pair that makes up a tag.
Tag values are strings and they store metadata.
Tag values are indexed so queries on tag values are performant.
Related entries: [tag](/influxdb/v1.8/concepts/glossary/#tag), [tag key](/influxdb/v1.8/concepts/glossary/#tag-key), [tag set](/influxdb/v1.8/concepts/glossary/#tag-set)
## timestamp
The date and time associated with a point.
All time in InfluxDB is UTC.
For how to specify time when writing data, see [Write Syntax](/influxdb/v1.8/write_protocols/write_syntax/).
For how to specify time when querying data, see [Data Exploration](/influxdb/v1.8/query_language/data_exploration/#time-syntax).
Related entries: [point](/influxdb/v1.8/concepts/glossary/#point)
## transformation
An InfluxQL function that returns a value or a set of values calculated from specified points, but does not return an aggregated value across those points.
See [InfluxQL Functions](/influxdb/v1.8/query_language/functions/#transformations) for a complete list of the available and upcoming transformations.
Related entries: [aggregation](/influxdb/v1.8/concepts/glossary/#aggregation), [function](/influxdb/v1.8/concepts/glossary/#function), [selector](/influxdb/v1.8/concepts/glossary/#selector)
## TSM (Time Structured Merge tree)
The purpose-built data storage format for InfluxDB. TSM allows for greater compaction and higher write and read throughput than existing B+ or LSM tree implementations. See [Storage Engine](http://docs.influxdata.com/influxdb/v1.8/concepts/storage_engine/) for more.
## user
There are two kinds of users in InfluxDB:
* *Admin users* have `READ` and `WRITE` access to all databases and full access to administrative queries and user management commands.
* *Non-admin users* have `READ`, `WRITE`, or `ALL` (both `READ` and `WRITE`) access per database.
When authentication is enabled, InfluxDB only executes HTTP requests that are sent with a valid username and password.
See [Authentication and Authorization](/influxdb/v1.8/administration/authentication_and_authorization/).
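For example, a minimal sketch of user management in InfluxQL (the usernames, passwords, and database name are hypothetical):

```
# Create an admin user
> CREATE USER "paul" WITH PASSWORD 'timeseries4days' WITH ALL PRIVILEGES

# Create a non-admin user and grant them read access to a database
> CREATE USER "todd" WITH PASSWORD 'influxdb41yf3'
> GRANT READ ON "my_database" TO "todd"
```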
## values per second
The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second.
To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5000 points is written 10 times per second, then the values per second rate is `4 field values per point * 5000 points per batch * 10 batches per second = 200,000 values per second`.
Related entries: [batch](/influxdb/v1.8/concepts/glossary/#batch), [field](/influxdb/v1.8/concepts/glossary/#field), [point](/influxdb/v1.8/concepts/glossary/#point), [points per second](/influxdb/v1.8/concepts/glossary/#points-per-second)
## WAL (Write Ahead Log)
The temporary cache for recently written points. To reduce the frequency with which the permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into TSM files.
Points in the WAL can be queried, and they persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes.
Related entries: [tsm](/influxdb/v1.8/concepts/glossary/#tsm-time-structured-merge-tree)
<!--
## shard
## shard group
-->


@ -0,0 +1,57 @@
---
title: InfluxDB design insights and tradeoffs
menu:
influxdb_1_8:
name: InfluxDB design insights and tradeoffs
weight: 40
parent: Concepts
---
InfluxDB is a time series database.
Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality.
Below is a list of some of those design insights that lead to tradeoffs:
1. For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
_**Pro:**_ Simplified [conflict resolution](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) increases write performance.
_**Con:**_ Cannot store duplicate data; may overwrite data in rare circumstances.
2. Deletes are a rare occurrence.
When they do occur it is almost always against large ranges of old data that are cold for writes.
_**Pro:**_ Restricting access to deletes allows for increased query and write performance.
_**Con:**_ Delete functionality is significantly restricted.
3. Updates to existing data are a rare occurrence and contentious updates never happen.
Time series data is predominantly new data that is never updated.
_**Pro:**_ Restricting access to updates allows for increased query and write performance.
_**Con:**_ Update functionality is significantly restricted.
4. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order.
_**Pro:**_ Adding data in time ascending order is significantly more performant.
_**Con:**_ Writing points with random times or with time not in ascending order is significantly less performant.
5. Scale is critical.
The database must be able to handle a *high* volume of reads and writes.
_**Pro:**_ The database can handle a *high* volume of reads and writes.
_**Con:**_ The InfluxDB development team was forced to make tradeoffs to increase performance.
6. Being able to write and query the data is more important than having a strongly consistent view.
_**Pro:**_ Writing and querying the database can be done by multiple clients and at high loads.
_**Con:**_ Query returns may not include the most recent points if database is under heavy load.
7. Many time [series](/influxdb/v1.8/concepts/glossary/#series) are ephemeral.
There are often time series that appear only for a few hours and then go away, e.g.
a new host that gets started and reports for a while and then gets shut down.
_**Pro:**_ InfluxDB is good at managing discontinuous data.
_**Con:**_ Schema-less design means that some database functions are not supported, e.g. there are no cross-table joins.
8. No one point is too important.
_**Pro:**_ InfluxDB has very powerful tools to deal with aggregate data and large data sets.
_**Con:**_ Points don't have IDs in the traditional sense; they are differentiated by timestamp and series.


@ -0,0 +1,213 @@
---
title: InfluxDB key concepts
description: Covers key concepts to learn about InfluxDB.
menu:
influxdb_1_8:
name: Key concepts
weight: 10
parent: Concepts
---
Before diving into InfluxDB it's good to get acquainted with some of the key concepts of the database.
This document provides a gentle introduction to those concepts and common InfluxDB terminology.
We've provided a list below of all the terms we'll cover, but we recommend reading this document from start to finish to gain a more general understanding of our favorite time series database.
<table style="width:100%">
<tr>
<td><a href="#database">database</a></td>
<td><a href="#field-key">field key</a></td>
<td><a href="#field-set">field set</a></td>
</tr>
<tr>
<td><a href="#field-value">field value</a></td>
<td><a href="#measurement">measurement</a></td>
<td><a href="#point">point</a></td>
</tr>
<tr>
<td><a href="#retention-policy">retention policy</a></td>
<td><a href="#series">series</a></td>
<td><a href="#tag-key">tag key</a></td>
</tr>
<tr>
<td><a href="#tag-set">tag set</a></td>
<td><a href="#tag-value">tag value</a></td>
<td><a href="#timestamp">timestamp</a></td>
</tr>
</table>
Check out the [glossary](/influxdb/v1.8/concepts/glossary/) if you prefer the cold, hard facts.
### Sample data
The next section references the data printed out below.
The data is fictional, but represents a believable setup in InfluxDB.
They show the number of butterflies and honeybees counted by two scientists (`langstroth` and `perpetua`) in two locations (location `1` and location `2`) over the time period from August 18, 2015 at midnight through August 18, 2015 at 6:12 AM.
Assume that the data live in a database called `my_database` and are subject to the `autogen` retention policy (more on databases and retention policies to come).
*Hint:* Hover over the links for tooltips to get acquainted with InfluxDB terminology and the layout.
**name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
| time | <span class ="tooltip" data-tooltip-text ="Field key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Field key">honeybees</span> | <span class ="tooltip" data-tooltip-text ="Tag key">location</span> | <span class ="tooltip" data-tooltip-text ="Tag key">scientist</span> |
| ---- | ------------------------------------------------------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------------------- | -------------------------------------------------------------------- |
| 2015-08-18T00:00:00Z | 12 | 23 | 1 | langstroth |
| 2015-08-18T00:00:00Z | 1 | 30 | 1 | perpetua |
| 2015-08-18T00:06:00Z | 11 | 28 | 1 | langstroth |
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">3</span> | <span class ="tooltip" data-tooltip-text ="Field value">28</span> | <span class ="tooltip" data-tooltip-text ="Tag value">1</span> | <span class ="tooltip" data-tooltip-text ="Tag value">perpetua</span> |
| 2015-08-18T05:54:00Z | 2 | 11 | 2 | langstroth |
| 2015-08-18T06:00:00Z | 1 | 10 | 2 | langstroth |
| 2015-08-18T06:06:00Z | 8 | 23 | 2 | perpetua |
| 2015-08-18T06:12:00Z | 7 | 22 | 2 | perpetua |
### Discussion
Now that you've seen some sample data in InfluxDB this section covers what it all means.
InfluxDB is a time series database so it makes sense to start with what is at the root of everything we do: time.
In the data above there's a column called `time` - all data in InfluxDB have that column.
`time` stores timestamps, and the <a name="timestamp"></a>_**timestamp**_ shows the date and time, in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) UTC, associated with particular data.
The next two columns, called `butterflies` and `honeybees`, are fields.
Fields are made up of field keys and field values.
<a name="field-key"></a>_**Field keys**_ (`butterflies` and `honeybees`) are strings; the field key `butterflies` tells us that the field values `12`-`7` refer to butterflies and the field key `honeybees` tells us that the field values `23`-`22` refer to, well, honeybees.
<a name="field-value"></a>_**Field values**_ are your data; they can be strings, floats, integers, or Booleans, and, because InfluxDB is a time series database, a field value is always associated with a timestamp.
The field values in the sample data are:
```
12 23
1 30
11 28
3 28
2 11
1 10
8 23
7 22
```
In the data above, the collection of field-key and field-value pairs make up a <a name="field-set"></a>_**field set**_.
Here are all eight field sets in the sample data:
* `butterflies = 12 honeybees = 23`
* `butterflies = 1 honeybees = 30`
* `butterflies = 11 honeybees = 28`
* `butterflies = 3 honeybees = 28`
* `butterflies = 2 honeybees = 11`
* `butterflies = 1 honeybees = 10`
* `butterflies = 8 honeybees = 23`
* `butterflies = 7 honeybees = 22`
Fields are a required piece of the InfluxDB data structure - you cannot have data in InfluxDB without fields.
It's also important to note that fields are not indexed.
[Queries](/influxdb/v1.8/concepts/glossary/#query) that use field values as filters must scan all values that match the other conditions in the query.
As a result, those queries are not performant relative to queries on tags (more on tags below).
In general, fields should not contain commonly-queried metadata.
The last two columns in the sample data, called `location` and `scientist`, are tags.
Tags are made up of tag keys and tag values.
Both <a name="tag-key"></a>_**tag keys**_ and <a name="tag-value"></a>_**tag values**_ are stored as strings and record metadata.
The tag keys in the sample data are `location` and `scientist`.
The tag key `location` has two tag values: `1` and `2`.
The tag key `scientist` also has two tag values: `langstroth` and `perpetua`.
In the data above, the <a name="tag-set"></a>_**tag set**_ is the different combinations of all the tag key-value pairs.
The four tag sets in the sample data are:
* `location = 1`, `scientist = langstroth`
* `location = 2`, `scientist = langstroth`
* `location = 1`, `scientist = perpetua`
* `location = 2`, `scientist = perpetua`
Tags are optional.
You don't need to have tags in your data structure, but it's generally a good idea to make use of them because, unlike fields, tags are indexed.
This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata.
Avoid using the following reserved keys:
* `_field`
* `_measurement`
* `time`
If reserved keys are included as a tag or field key, the associated point is discarded.
> **Why indexing matters: The schema case study**
> Say you notice that most of your queries focus on the values of the field keys `honeybees` and `butterflies`:
> `SELECT * FROM "census" WHERE "butterflies" = 1`
> `SELECT * FROM "census" WHERE "honeybees" = 23`
> Because fields aren't indexed, InfluxDB scans every value of `butterflies` in the first query and every value of `honeybees` in the second query before it provides a response.
That behavior can hurt query response times - especially on a much larger scale.
To optimize your queries, it may be beneficial to rearrange your [schema](/influxdb/v1.8/concepts/glossary/#schema) such that the fields (`butterflies` and `honeybees`) become the tags and the tags (`location` and `scientist`) become the fields:
> **name:** <span class="tooltip" data-tooltip-text="Measurement">census</span>
>
| time | <span class ="tooltip" data-tooltip-text ="Field key">location</span> | <span class ="tooltip" data-tooltip-text ="Field key">scientist</span> | <span class ="tooltip" data-tooltip-text ="Tag key">butterflies</span> | <span class ="tooltip" data-tooltip-text ="Tag key">honeybees</span> |
| ---- | --------------------------------------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------- |
| 2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
| 2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
| 2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
| <span class="tooltip" data-tooltip-text="Timestamp">2015-08-18T00:06:00Z</span> | <span class ="tooltip" data-tooltip-text ="Field value">1</span> | <span class ="tooltip" data-tooltip-text ="Field value">perpetua</span> | <span class ="tooltip" data-tooltip-text ="Tag value">3</span> | <span class ="tooltip" data-tooltip-text ="Tag value">28</span> |
| 2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
| 2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
| 2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
| 2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
> Now that `butterflies` and `honeybees` are tags, InfluxDB won't have to scan every one of their values when it performs the queries above - this means that your queries are even faster.
The <a name=measurement></a>_**measurement**_ acts as a container for tags, fields, and the `time` column, and the measurement name is the description of the data that are stored in the associated fields.
Measurement names are strings, and, for any SQL users out there, a measurement is conceptually similar to a table.
The only measurement in the sample data is `census`.
The name `census` tells us that the field values record the number of `butterflies` and `honeybees` - not their size, direction, or some sort of happiness index.
A single measurement can belong to different retention policies.
A <a name="retention-policy"></a>_**retention policy**_ describes how long InfluxDB keeps data (`DURATION`) and how many copies of the data are stored in the cluster (`REPLICATION`).
If you're interested in reading more about retention policies, check out [Database Management](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
{{% warn %}} Replication factors do not serve a purpose with single node instances.
{{% /warn %}}
In the sample data, everything in the `census` measurement belongs to the `autogen` retention policy.
InfluxDB automatically creates that retention policy; it has an infinite duration and a replication factor set to one.
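You can inspect a database's retention policies with InfluxQL; for the sample database, the output would look something like this (a `duration` of `0s` indicates an infinite retention period):

```
> SHOW RETENTION POLICIES ON "my_database"
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true
```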
Now that you're familiar with measurements, tag sets, and retention policies, let's discuss series.
In InfluxDB, a <a name=series></a>_**series**_ is a collection of points that share a measurement, tag set, and field key.
The data above consist of eight series:
| Series number | Measurement | Tag set | Field key |
|:------------------------ | ----------- | ------- | --------- |
| series 1 | `census` | `location = 1`,`scientist = langstroth` | `butterflies` |
| series 2 | `census` | `location = 2`,`scientist = langstroth` | `butterflies` |
| series 3 | `census` | `location = 1`,`scientist = perpetua` | `butterflies` |
| series 4 | `census` | `location = 2`,`scientist = perpetua` | `butterflies` |
| series 5 | `census` | `location = 1`,`scientist = langstroth` | `honeybees` |
| series 6 | `census` | `location = 2`,`scientist = langstroth` | `honeybees` |
| series 7 | `census` | `location = 1`,`scientist = perpetua` | `honeybees` |
| series 8 | `census` | `location = 2`,`scientist = perpetua` | `honeybees` |
Understanding the concept of a series is essential when designing your [schema](/influxdb/v1.8/concepts/glossary/#schema) and when working with your data in InfluxDB.
A <a name="point"></a>_**point**_ represents a single data record that has four components: a measurement, tag set, field set, and a timestamp. A point is uniquely identified by its series and timestamp.
For example, here's a single point:
```
name: census
-----------------
time butterflies honeybees location scientist
2015-08-18T00:00:00Z 1 30 1 perpetua
```
The point in this example is part of series 3 and defined by the measurement (`census`), the tag set (`location = 1`, `scientist = perpetua`), the field set (`butterflies = 1`, `honeybees = 30`), and the timestamp `2015-08-18T00:00:00Z`.
All of the stuff we've just covered is stored in a database - the sample data are in the database `my_database`.
An InfluxDB <a name=database></a>_**database**_ is similar to traditional relational databases and serves as a logical container for users, retention policies, continuous queries, and, of course, your time series data.
See [Authentication and Authorization](/influxdb/v1.8/administration/authentication_and_authorization/) and [Continuous Queries](/influxdb/v1.8/query_language/continuous_queries/) for more on those topics.
Databases can have several users, continuous queries, retention policies, and measurements.
InfluxDB is a schemaless database, which means it's easy to add new measurements, tags, and fields at any time.
It's designed to make working with time series data awesome.
You made it!
You've covered the fundamental concepts and terminology in InfluxDB.
If you're just starting out, we recommend taking a look at [Getting Started](/influxdb/v1.8/introduction/getting_started/) and the [Writing Data](/influxdb/v1.8/guides/writing_data/) and [Querying Data](/influxdb/v1.8/guides/querying_data/) guides.
May our time series database serve you well 🕔.


@ -0,0 +1,250 @@
---
title: InfluxDB schema design and data layout
description: Covers general guidelines for InfluxDB schema design and data layout.
menu:
influxdb_1_8:
name: Schema design and data layout
weight: 50
parent: Concepts
---
Every InfluxDB use case is special and your [schema](/influxdb/v1.8/concepts/glossary/#schema) will reflect that uniqueness.
There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
<table style="width:100%">
<tr>
<td><a href="#general-recommendations">General Recommendations</a></td>
<td><a href="#encouraged-schema-design">Encouraged Schema Design</a></td>
<td><a href="#discouraged-schema-design">Discouraged Schema Design</a></td>
<td><a href="#shard-group-duration-management">Shard Group Duration Management</a></td>
</tr>
</table>
## General recommendations
### Encouraged schema design
We recommend that you:
- [Encode meta data in tags](#encode-meta-data-in-tags)
- [Avoid using keywords as tag or field names](#avoid-using-keywords-as-tag-or-field-names)
#### Encode meta data in tags
[Tags](/influxdb/v1.8/concepts/glossary/#tag) are indexed and [fields](/influxdb/v1.8/concepts/glossary/#field) are not indexed.
This means that queries on tags are more performant than those on fields.
In general, your queries should guide what gets stored as a tag and what gets stored as a field:
- Store commonly-queried meta data in tags
- Store data in tags if you plan to use them with the InfluxQL `GROUP BY` clause
- Store data in fields if you plan to use them with an [InfluxQL](/influxdb/v1.8/query_language/functions/) function
- Store numeric values as fields ([tag values](/influxdb/v1.8/concepts/glossary/#tag-value) only support string values)
#### Avoid using keywords as tag or field names
This isn't required, but it simplifies writing queries because you won't have to wrap tag or field names in double quotes.
See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux](https://github.com/influxdata/flux/blob/master/docs/SPEC.md#keywords) keywords to avoid.
Also, if a tag or field name contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/flux/latest/introduction/getting-started/syntax-basics/#objects) in Flux.
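For example, if a field key is named `duration` (an InfluxQL keyword), InfluxQL queries must wrap it in double quotes (the `tasks` measurement is hypothetical):

```
> SELECT "duration" FROM "tasks"
```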
### Discouraged schema design
We recommend that you:
- [Avoid too many series](#avoid-too-many-series)
- [Avoid the same name for a tag and a field](#avoid-the-same-name-for-a-tag-and-a-field)
- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
- [Avoid putting more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
#### Avoid too many series
[Tags](/influxdb/v1.8/concepts/glossary/#tag) containing highly variable information like UUIDs, hashes, and random strings lead to a large number of [series](/influxdb/v1.8/concepts/glossary/#series) in the database, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
See [Hardware sizing guidelines](/influxdb/v1.8/guides/hardware_sizing/#general-hardware-guidelines-for-a-single-node) for [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality) recommendations based on your hardware. If the system has memory constraints, consider storing high-cardinality data as a field rather than a tag.
#### Avoid the same name for a tag and a field
Avoid using the same name for a tag and field key.
This often results in unexpected behavior when querying data.
If you inadvertently add the same name for a tag and field key, see
[Frequently asked questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
for information about how to query the data predictably and how to fix the issue.
#### Avoid encoding data in measurement names
InfluxDB queries merge data that falls within the same [measurement](/influxdb/v1.8/concepts/glossary/#measurement); it's better to differentiate data with [tags](/influxdb/v1.8/concepts/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated or impossible.
_Example:_
Consider the following schema represented by line protocol.
```
Schema 1 - Data encoded in the measurement name
-------------
blueberries.plot-1.north temp=50.1 1472515200000000000
blueberries.plot-2.midwest temp=49.8 1472515200000000000
```
The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
Compare this to schema 2:
```
Schema 2 - Data encoded in tags
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region:
##### Flux
```js
// Schema 1 - Query for data encoded in the measurement name
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
|> mean()
// Schema 2 - Query for data encoded in tags
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
##### InfluxQL
```
# Schema 1 - Query for data encoded in the measurement name
> SELECT mean("temp") FROM /\.north$/
# Schema 2 - Query for data encoded in tags
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
```
#### Avoid putting more than one piece of information in one tag
Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
Consider the following schema represented by line protocol.
```
Schema 1 - Multiple data encoded in a single tag
-------------
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
```
The Schema 1 data encodes multiple separate parameters, the `plot` and `region`, into a long tag value (`plot-1.north`).
Compare this to the following schema represented in line protocol.
```
Schema 2 - Data encoded in multiple tags
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
Schema 2 is preferable because, with multiple tags, you don't need a regular expression.
##### Flux
```js
// Schema 1 - Query for multiple data encoded in a single tag
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
|> mean()
// Schema 2 - Query for data encoded in multiple tags
from(bucket:"<database>/<retention_policy>")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
##### InfluxQL
```
# Schema 1 - Query for multiple data encoded in a single tag
> SELECT mean("temp") FROM "weather_sensor" WHERE location =~ /\.north$/
# Schema 2 - Query for data encoded in multiple tags
> SELECT mean("temp") FROM "weather_sensor" WHERE region = 'north'
```
## Shard group duration management
### Shard group duration overview
InfluxDB stores data in shard groups.
Shard groups are organized by [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) (RP) and store data with timestamps that fall within a specific time interval called the [shard duration](/influxdb/v1.8/concepts/glossary/#shard-duration).
If no shard group duration is provided, the shard group duration is determined by the RP [duration](/influxdb/v1.8/concepts/glossary/#duration) at the time the RP is created. The default values are:
| RP Duration | Shard Group Duration |
|---|---|
| < 2 days | 1 hour |
| >= 2 days and <= 6 months | 1 day |
| > 6 months | 7 days |
The shard group duration is also configurable per RP.
To configure the shard group duration, see [Retention Policy Management](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
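For example, a minimal sketch of setting and changing the shard group duration with InfluxQL (the retention policy and database names are hypothetical):

```
# Set the shard group duration when creating a retention policy
> CREATE RETENTION POLICY "one_year" ON "my_database" DURATION 52w REPLICATION 1 SHARD DURATION 1w

# Change the shard group duration of an existing retention policy
> ALTER RETENTION POLICY "one_year" ON "my_database" SHARD DURATION 2w
```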
### Shard group duration tradeoffs
Determining the optimal shard group duration requires finding the balance between:
- Better overall performance with longer shards
- Flexibility provided by shorter shards
#### Long shard group duration
Longer shard group durations let InfluxDB store more data in the same logical location.
This reduces data duplication, improves compression efficiency, and improves query speed in some cases.
#### Short shard group duration
Shorter shard group durations allow the system to more efficiently drop data and record incremental backups.
When InfluxDB enforces an RP it drops entire shard groups, not individual data points, even if the points are older than the RP duration.
A shard group is only removed once the shard group's *end time* is older than the RP duration.
For example, if your RP has a duration of one day, InfluxDB will drop an hour's worth of data every hour and will always have 25 shard groups: one for each hour in the day, plus an extra shard group that is partially expiring but isn't removed until the whole shard group is older than 24 hours.
>**Note:** A special use case to consider: filtering queries on schema data (such as tags, series, measurements) by time. For example, if you want to filter schema data within a one hour interval, you must set the shard group duration to 1h. For more information, see [filter schema data by time](/influxdb/v1.8/query_language/schema_exploration/#filter-meta-queries-by-time).
### Shard group duration recommendations
The default shard group durations work well for most cases. However, high-throughput or long-running instances will benefit from using longer shard group durations.
Here are some recommendations for longer shard group durations:
| RP Duration | Shard Group Duration |
|---|---|
| <= 1 day | 6 hours |
| > 1 day and <= 7 days | 1 day |
| > 7 days and <= 3 months | 7 days |
| > 3 months | 30 days |
| infinite | 52 weeks or longer |
> **Note:** `INF` (infinite) is not a [valid shard group duration](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
In extreme cases where data covers decades and will never be deleted, a long shard group duration like `1040w` (20 years) is perfectly valid.
Other factors to consider before setting shard group duration:
* Shard groups should be twice as long as the longest time range of the most frequent queries
* Each shard group should contain more than 100,000 [points](/influxdb/v1.8/concepts/glossary/#point)
* Each shard group should contain more than 1,000 points per [series](/influxdb/v1.8/concepts/glossary/#series)
#### Shard group duration for backfilling
Bulk insertion of historical data covering a large time range in the past will trigger the creation of a large number of shards at once.
The concurrent access and overhead of writing to hundreds or thousands of shards can quickly lead to slow performance and memory exhaustion.
When writing historical data, we highly recommend temporarily setting a longer shard group duration so fewer shards are created. Typically, a shard group duration of 52 weeks works well for backfilling.
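A sketch of this backfilling workflow in InfluxQL (names hypothetical; note that a changed shard group duration only applies to newly created shard groups):

```
# Before backfilling, temporarily lengthen the shard group duration
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 52w

# ...write the historical data...

# After backfilling, restore a shard group duration suited to incoming data
> ALTER RETENTION POLICY "autogen" ON "my_database" SHARD DURATION 1w
```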


@ -0,0 +1,436 @@
---
title: In-memory indexing and the Time-Structured Merge Tree (TSM)
menu:
influxdb_1_8:
name: In-memory indexing with TSM
weight: 60
parent: Concepts
---
## The InfluxDB storage engine and the Time-Structured Merge Tree (TSM)
The InfluxDB storage engine looks very similar to an LSM Tree.
It has a write ahead log and a collection of read-only data files which are similar in concept to SSTables in an LSM Tree.
TSM files contain sorted, compressed series data.
InfluxDB will create a [shard](/influxdb/v1.8/concepts/glossary/#shard) for each block of time.
For example, if you have a [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) with an unlimited duration, shards will be created for each 7 day block of time.
Each of these shards maps to an underlying storage engine database.
Each of these databases has its own [WAL](/influxdb/v1.8/concepts/glossary/#wal-write-ahead-log) and TSM files.
We'll dig into each of these parts of the storage engine.
## Storage engine
The storage engine ties a number of components together and provides the external interface for storing and querying series data. It is composed of a number of components that each serve a particular role:
* In-Memory Index - The in-memory index is a shared index across shards that provides quick access to [measurements](/influxdb/v1.8/concepts/glossary/#measurement), [tags](/influxdb/v1.8/concepts/glossary/#tag), and [series](/influxdb/v1.8/concepts/glossary/#series). The index is used by the engine, but is not specific to the storage engine itself.
* WAL - The WAL is a write-optimized storage format that allows for writes to be durable, but not easily queryable. Writes to the WAL are appended to segments of a fixed size.
* Cache - The Cache is an in-memory representation of the data stored in the WAL. It is queried at runtime and merged with the data stored in TSM files.
* TSM Files - TSM files store compressed series data in a columnar format.
* FileStore - The FileStore mediates access to all TSM files on disk. It ensures that TSM files are installed atomically when existing ones are replaced as well as removing TSM files that are no longer used.
* Compactor - The Compactor is responsible for converting less optimized Cache and TSM data into more read-optimized formats. It does this by compressing series, removing deleted data, optimizing indices and combining smaller files into larger ones.
* Compaction Planner - The Compaction Planner determines which TSM files are ready for a compaction and ensures that multiple concurrent compactions do not interfere with each other.
* Compression - Compression is handled by various Encoders and Decoders for specific data types. Some encoders are fairly static and always encode the same type the same way; others switch their compression strategy based on the shape of the data.
* Writers/Readers - Each file type (WAL segment, TSM files, tombstones, etc.) has Writers and Readers for working with the formats.
### Write Ahead Log (WAL)
The WAL is organized as a bunch of files that look like `_000001.wal`.
The file numbers are monotonically increasing and referred to as WAL segments.
When a segment reaches 10MB in size, it is closed and a new one is opened. Each WAL segment stores multiple compressed blocks of writes and deletes.
When a write comes in the new points are serialized, compressed using Snappy, and written to a WAL file.
The file is `fsync`'d and the data is added to an in-memory index before a success is returned.
This means that batching points together is required to achieve high throughput performance.
(Optimal batch size seems to be 5,000-10,000 points per batch for many use cases.)
Each entry in the WAL follows a [TLV standard](https://en.wikipedia.org/wiki/Type-length-value) with a single byte representing the type of entry (write or delete), a 4 byte `uint32` for the length of the compressed block, and then the compressed block.
### Cache
The Cache is an in-memory copy of all data points currently stored in the WAL.
The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.8/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.8/concepts/glossary/#field).
Each field is kept as its own time-ordered range.
The Cache data is not compressed while in memory.
Queries to the storage engine will merge data from the Cache with data from the TSM files.
Queries execute on a copy of the data that is made from the cache at query processing time.
This way writes that come in while a query is running won't affect the result.
Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
The Cache exposes a few controls for snapshotting behavior.
The two most important controls are the memory limits.
There is a lower bound, [`cache-snapshot-memory-size`](/influxdb/v1.8/administration/config#cache-snapshot-memory-size-25m), which when exceeded will trigger a snapshot to TSM files and remove the corresponding WAL segments.
There is also an upper bound, [`cache-max-memory-size`](/influxdb/v1.8/administration/config#cache-max-memory-size-1g), which when exceeded will cause the Cache to reject new writes.
These configurations are useful to prevent out of memory situations and to apply back pressure to clients writing data faster than the instance can persist it.
The checks for memory thresholds occur on every write.
The other snapshot controls are time based.
The idle threshold, [`cache-snapshot-write-cold-duration`](/influxdb/v1.8/administration/config#cache-snapshot-write-cold-duration-10m), forces the Cache to snapshot to TSM files if it hasn't received a write within the specified interval.
The in-memory Cache is recreated on restart by re-reading the WAL files on disk.
### TSM files
TSM files are a collection of read-only files that are memory mapped.
The structure of these files looks very similar to an SSTable in LevelDB or other LSM Tree variants.
A TSM file is composed of four sections: header, blocks, index, and footer.
```
+--------+------------------------------------+-------------+--------------+
| Header | Blocks | Index | Footer |
|5 bytes | N bytes | N bytes | 4 bytes |
+--------+------------------------------------+-------------+--------------+
```
The Header is a magic number to identify the file type and a version number.
```
+-------------------+
| Header |
+-------------------+
| Magic │ Version |
| 4 bytes │ 1 byte |
+-------------------+
```
Blocks are sequences of pairs of CRC32 checksums and data.
The block data is opaque to the file.
The CRC32 is used for block level error detection.
The length of the blocks is stored in the index.
```
+--------------------------------------------------------------------+
│ Blocks │
+---------------------+-----------------------+----------------------+
| Block 1 | Block 2 | Block N |
+---------------------+-----------------------+----------------------+
| CRC | Data | CRC | Data | CRC | Data |
| 4 bytes | N bytes | 4 bytes | N bytes | 4 bytes | N bytes |
+---------------------+-----------------------+----------------------+
```
Following the blocks is the index for the blocks in the file.
The index is composed of a sequence of index entries ordered lexicographically by key and then by time.
The key includes the measurement name, tag set, and one field.
Multiple fields per point create multiple index entries in the TSM file.
Each index entry starts with a key length and the key, followed by the block type (float, int, bool, string) and a count of the number of index block entries that follow for that key.
Each index block entry is composed of the min and max time for the block, the offset into the file where the block is located and the size of the block. There is one index block entry for each block in the TSM file that contains the key.
The index structure can provide efficient access to all blocks as well as the ability to determine the cost associated with accessing a given key.
Given a key and timestamp, we can determine whether a file contains the block for that timestamp.
We can also determine where that block resides and how much data must be read to retrieve the block.
Knowing the size of the block, we can efficiently provision our IO statements.
```
+-----------------------------------------------------------------------------+
│ Index │
+-----------------------------------------------------------------------------+
│ Key Len │ Key │ Type │ Count │Min Time │Max Time │ Offset │ Size │...│
│ 2 bytes │ N bytes │1 byte│2 bytes│ 8 bytes │ 8 bytes │8 bytes │4 bytes │ │
+-----------------------------------------------------------------------------+
```
The last section is the footer that stores the offset of the start of the index.
```
+---------+
│ Footer │
+---------+
│Index Ofs│
│ 8 bytes │
+---------+
```
### Compression
Each block is compressed to reduce storage space and disk IO when querying.
A block contains the timestamps and values for a given series and field.
Each block has a one-byte header, followed by the compressed timestamps and then the compressed values.
```
+--------------------------------------------------+
| Type | Len | Timestamps | Values |
|1 Byte | VByte | N Bytes | N Bytes │
+--------------------------------------------------+
```
The timestamps and values are compressed and stored separately using encodings dependent on the data type and its shape.
Storing them independently allows timestamp encoding to be used for all timestamps, while allowing different encodings for different field types.
For example, some points may be able to use run-length encoding whereas others may not.
Each value type also contains a one-byte header indicating the type of compression for the remaining bytes.
The four high bits store the compression type and the four low bits are used by the encoder if needed.
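As a rough sketch of how such a header byte could be packed and unpacked (illustrative, not the actual encoder code):

```go
package main

import "fmt"

// packHeader stores the compression type in the four high bits and any
// encoder-specific bits in the four low bits.
func packHeader(compType, encoderBits byte) byte {
	return compType<<4 | encoderBits&0x0F
}

// unpackHeader reverses packHeader.
func unpackHeader(h byte) (compType, encoderBits byte) {
	return h >> 4, h & 0x0F
}

func main() {
	h := packHeader(2, 3)
	t, e := unpackHeader(h)
	fmt.Printf("header=%#02x type=%d encoder=%d\n", h, t, e) // header=0x23 type=2 encoder=3
}
```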
#### Timestamps
Timestamp encoding is adaptive and based on the structure of the timestamps that are encoded.
It uses a combination of delta encoding, scaling, and compression using simple8b run-length encoding, as well as falling back to no compression if needed.
Timestamp resolution is variable but can be as granular as a nanosecond, requiring up to 8 bytes to store uncompressed.
During encoding, the values are first delta-encoded.
The first value is the starting timestamp and subsequent values are the differences from the prior value.
This usually converts the values into much smaller integers that are easier to compress.
Many timestamps are also monotonically increasing and fall on even boundaries of time such as every 10s.
When timestamps have this structure, they are scaled by the largest common divisor that is also a factor of 10.
This has the effect of converting very large integer deltas into smaller ones that compress even better.
Using these adjusted values, if all the deltas are the same, the time range is stored using run-length encoding.
If run-length encoding is not possible and all values are less than (1 << 60) - 1 ([~36.5 years](https://www.wolframalpha.com/input/?i=\(1+%3C%3C+60\)+-+1+nanoseconds+to+years) at nanosecond resolution), then the timestamps are encoded using [simple8b encoding](https://github.com/jwilder/encoding/tree/master/simple8b).
Simple8b encoding is a 64-bit word-aligned integer encoding that packs multiple integers into a single 64-bit word.
If any value exceeds the maximum, the deltas are stored uncompressed, using 8 bytes each for the block.
Future encodings may use a patched scheme such as Patched Frame-Of-Reference (PFOR) to handle outliers more effectively.
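To illustrate the first two steps, here is a minimal Go sketch of delta encoding and power-of-10 scaling. It stops short of run-length and simple8b packing, caps the divisor search at 1e9 as an assumption, and the names are illustrative.

```go
package main

import "fmt"

// deltaEncode keeps the first timestamp as-is and replaces each
// subsequent value with its difference from the prior value.
func deltaEncode(ts []int64) []int64 {
	out := make([]int64, len(ts))
	copy(out, ts)
	for i := len(out) - 1; i > 0; i-- {
		out[i] -= out[i-1]
	}
	return out
}

// scale divides all deltas by the largest power-of-10 divisor they
// share (searched here up to 1e9), shrinking the values further.
func scale(deltas []int64) (scaled []int64, div int64) {
	div = 1_000_000_000
	for div > 1 {
		ok := true
		for _, d := range deltas[1:] { // deltas[0] is the start timestamp
			if d%div != 0 {
				ok = false
				break
			}
		}
		if ok {
			break
		}
		div /= 10
	}
	scaled = make([]int64, len(deltas))
	copy(scaled, deltas)
	for i := 1; i < len(scaled); i++ {
		scaled[i] /= div
	}
	return scaled, div
}

func main() {
	// Nanosecond timestamps arriving every 10s.
	ts := []int64{0, 10_000_000_000, 20_000_000_000, 30_000_000_000}
	scaled, div := scale(deltaEncode(ts))
	fmt.Println(scaled, div) // [0 10 10 10] 1000000000: equal deltas -> run-length encodable
}
```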
#### Floats
Floats are encoded using an implementation of the [Facebook Gorilla paper](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
The encoding XORs consecutive values together to produce a small result when the values are close together.
The delta is then stored using control bits to indicate how many leading and trailing zeroes are in the XOR value.
Our implementation removes the timestamp encoding described in the paper and only encodes the float values.
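A minimal Go sketch of the core XOR step, leaving out the bit-level control codes that pack the result:

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorDelta XORs two consecutive float64 values and reports the leading
// and trailing zero counts; the encoder emits only the middle bits.
func xorDelta(prev, cur float64) (delta uint64, leading, trailing int) {
	delta = math.Float64bits(prev) ^ math.Float64bits(cur)
	if delta == 0 {
		return 0, 64, 0 // identical values compress to a single control bit
	}
	return delta, bits.LeadingZeros64(delta), bits.TrailingZeros64(delta)
}

func main() {
	d, l, t := xorDelta(15.5, 14.0625)
	fmt.Printf("xor=%#016x leading=%d trailing=%d\n", d, l, t)
}
```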
#### Integers
Integer encoding uses two different strategies depending on the range of values in the uncompressed data.
Values are first encoded using [ZigZag encoding](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers).
This interleaves positive and negative integers across a range of positive integers.
For example, [-2,-1,0,1] becomes [3,1,0,2].
See Google's [Protocol Buffers documentation](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers) for more information.
If all ZigZag encoded values are less than (1 << 60) - 1, they are compressed using simple8b encoding.
If any values are larger than the maximum then all values are stored uncompressed in the block.
If all values are identical, run-length encoding is used.
This works very well for values that are frequently constant.
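A small Go sketch of ZigZag encoding and decoding, reproducing the example above:

```go
package main

import "fmt"

// zigzagEncode maps signed integers onto unsigned ones so that values
// near zero (positive or negative) stay small: -2,-1,0,1 -> 3,1,0,2.
func zigzagEncode(v int64) uint64 {
	return uint64(v<<1) ^ uint64(v>>63)
}

// zigzagDecode reverses zigzagEncode.
func zigzagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	for _, v := range []int64{-2, -1, 0, 1} {
		fmt.Print(zigzagEncode(v), " ") // 3 1 0 2
	}
	fmt.Println()
}
```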
#### Booleans
Booleans are encoded using a simple bit packing strategy where each Boolean uses 1 bit.
The number of Booleans encoded is stored using variable-byte encoding at the beginning of the block.
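A minimal sketch of that strategy in Go, using a variable-byte (varint) count prefix; the bit order here is an assumption for illustration:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packBools encodes the count as a varint and then packs each Boolean
// into a single bit, most significant bit first.
func packBools(vals []bool) []byte {
	out := make([]byte, binary.MaxVarintLen64+(len(vals)+7)/8)
	n := binary.PutUvarint(out, uint64(len(vals)))
	buf := out[n:]
	for i, v := range vals {
		if v {
			buf[i/8] |= 1 << (7 - uint(i%8))
		}
	}
	return out[:n+(len(vals)+7)/8]
}

func main() {
	b := packBools([]bool{true, false, true, true})
	fmt.Printf("% x\n", b) // 04 b0 -> count 4, bits 1011 0000
}
```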
#### Strings
Strings are encoded using [Snappy](http://google.github.io/snappy/) compression.
Each string is packed consecutively and they are compressed as one larger block.
### Compactions
Compactions are recurring processes that migrate data stored in a write-optimized format into a more read-optimized format.
There are a number of stages of compaction that take place while a shard is hot for writes:
* Snapshots - Values in the Cache and WAL must be converted to TSM files to free memory and disk space used by the WAL segments.
These compactions occur based on the cache memory and time thresholds.
* Level Compactions - Level compactions (levels 1-4) occur as the TSM files grow.
TSM files are compacted from snapshots to level 1 files.
Multiple level 1 files are compacted to produce level 2 files.
The process continues until files reach level 4 and the max size for a TSM file.
They will not be compacted further unless deletes, index optimization compactions, or full compactions need to run.
Lower level compactions use strategies that avoid CPU-intensive activities like decompressing and combining blocks.
Higher level (and thus less frequent) compactions will re-combine blocks to fully compact them and increase the compression ratio.
* Index Optimization - When many level 4 TSM files accumulate, the internal indexes become larger and more costly to access.
An index optimization compaction splits the series and indices across a new set of TSM files, sorting all points for a given series into one TSM file.
Before an index optimization, each TSM file contained points for most or all series, and thus each contained largely the same series index.
After an index optimization, each TSM file contains points from a minimal set of series and there is little series overlap between files.
Each TSM file thus has a smaller unique series index, instead of a duplicate of the full series list.
In addition, all points from a particular series are contiguous in a TSM file rather than spread across multiple TSM files.
* Full Compactions - Full compactions run when a shard has become cold for writes for a long time, or when deletes have occurred on the shard.
Full compactions produce an optimal set of TSM files and include all optimizations from Level and Index Optimization compactions.
Once a shard is fully compacted, no other compactions will run on it unless new writes or deletes are stored.
### Writes
Writes are appended to the current WAL segment and are also added to the Cache.
Each WAL segment has a maximum size.
Writes roll over to a new file once the current file fills up.
The cache is also size bounded; snapshots are taken and WAL compactions are initiated when the cache becomes too full.
If the inbound write rate exceeds the WAL compaction rate for a sustained period, the cache may become too full, in which case new writes will fail until the snapshot process catches up.
When WAL segments fill up and are closed, the Compactor snapshots the Cache and writes the data to a new TSM file.
When the TSM file is successfully written and `fsync`'d, it is loaded and referenced by the FileStore.
### Updates
Updates (writing a newer value for a point that already exists) occur as normal writes.
Since cached values overwrite existing values, newer writes take precedence.
If a write would overwrite a point in a prior TSM file, the points are merged at query runtime and the newer write takes precedence.
### Deletes
Deletes occur by writing a delete entry to the WAL for the measurement or series and then updating the Cache and FileStore.
The Cache evicts all relevant entries.
The FileStore writes a tombstone file for each TSM file that contains relevant data.
These tombstone files are used at startup time to ignore blocks as well as during compactions to remove deleted entries.
Queries against partially deleted series are handled at query time until a compaction removes the data fully from the TSM files.
### Queries
When a query is executed by the storage engine, it is essentially a seek to a given time associated with a specific series key and field.
First, we do a search on the data files to find the files that contain a time range matching the query as well as containing matching series.
Once we have the data files selected, we next need to find the position in the file of the series key index entries.
We run a binary search against each TSM index to find the location of its index blocks.
In the common case, the blocks will not overlap across multiple TSM files and we can search the index entries linearly to find the start block from which to read.
If there are overlapping blocks of time, the index entries are sorted to ensure newer writes will take precedence and that blocks can be processed in order during query execution.
When iterating over the index entries the blocks are read sequentially from the blocks section.
The block is decompressed and we seek to the specific point.
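Since index entries are sorted by key, the per-file lookup reduces to a binary search followed by a linear scan of that key's entries. Here is a minimal Go sketch over an already-parsed index; the types and key format are illustrative:

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry is a simplified, already-parsed view of one index block entry.
type indexEntry struct {
	key              string // series key + field, illustrative format
	minTime, maxTime int64
}

// findBlock binary-searches the key-sorted index for key, then scans
// that key's entries linearly for the block covering ts.
func findBlock(idx []indexEntry, key string, ts int64) (indexEntry, bool) {
	i := sort.Search(len(idx), func(i int) bool { return idx[i].key >= key })
	for ; i < len(idx) && idx[i].key == key; i++ {
		if ts >= idx[i].minTime && ts <= idx[i].maxTime {
			return idx[i], true
		}
	}
	return indexEntry{}, false
}

func main() {
	idx := []indexEntry{
		{"cpu,host=a!usage", 0, 999},
		{"cpu,host=a!usage", 1000, 1999},
		{"cpu,host=b!usage", 0, 1999},
	}
	fmt.Println(findBlock(idx, "cpu,host=a!usage", 1500)) // block covering 1000-1999, true
}
```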
# The new InfluxDB storage engine: from LSM Tree to B+Tree and back again to create the Time Structured Merge Tree
Writing a new storage format should be a last resort.
So how did InfluxData end up writing our own engine?
InfluxData has experimented with many storage formats and found each lacking in some fundamental way.
The performance requirements for InfluxDB are significant, and eventually overwhelm other storage systems.
The 0.8 line of InfluxDB allowed multiple storage engines, including LevelDB, RocksDB, HyperLevelDB, and LMDB.
The 0.9 line of InfluxDB used BoltDB as the underlying storage engine.
This writeup is about the Time Structured Merge Tree storage engine that was released in 0.9.5 and is the only storage engine supported in InfluxDB 0.11+, including the entire 1.x family.
The properties of the time series data use case make it challenging for many existing storage engines.
Over the course of InfluxDB development, InfluxData tried a few of the more popular options.
We started with LevelDB, an engine based on LSM Trees, which are optimized for write throughput.
After that we tried BoltDB, an engine based on a memory mapped B+Tree, which is optimized for reads.
Finally, we ended up building our own storage engine that is similar in many ways to LSM Trees.
With our new storage engine we were able to achieve up to a 45x reduction in disk space usage from our B+Tree setup with even greater write throughput and compression than what we saw with LevelDB and its variants.
This post will cover the details of that evolution and end with an in-depth look at our new storage engine and its inner workings.
## Properties of time series data
The workload of time series data is quite different from normal database workloads.
There are a number of factors that conspire to make it very difficult to scale and remain performant:
* Billions of individual data points
* High write throughput
* High read throughput
* Large deletes (data expiration)
* Mostly an insert/append workload, very few updates
The first and most obvious problem is one of scale.
In DevOps, IoT, or APM it is easy to collect hundreds of millions or billions of unique data points every day.
For example, let's say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds.
Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day per server.
That gives us a total of 172,800,000 (`200 * 100 * 8,640`) individual data points per day.
We find similar or larger numbers in sensor data use cases.
The volume of data means that the write throughput can be very high.
We regularly get requests for setups that can handle hundreds of thousands of writes per second.
Some larger companies will only consider systems that can handle millions of writes per second.
At the same time, time series data can be a high read throughput use case.
It's true that if you're tracking 700,000 unique metrics or time series you can't hope to visualize all of them.
That leads many people to think that you don't actually read most of the data that goes into the database.
However, beyond the dashboards that people have up on their screens, there are automated systems for monitoring or combining the large volume of time series data with other types of data.
Inside InfluxDB, aggregate functions calculated on the fly may combine tens of thousands of distinct time series into a single view.
Each one of those queries must read each aggregated data point, so for InfluxDB the read throughput is often many times higher than the write throughput.
Given that time series is mostly an append-only workload, you might think that it's possible to get great performance on a B+Tree.
Appends in the keyspace are efficient and you can achieve more than 100,000 appends per second.
However, we have those appends happening in individual time series.
So the inserts end up looking more like random inserts than append-only inserts.
One of the biggest problems we found with time series data is that it's very common to delete all data after it gets past a certain age.
The common pattern here is that users have high precision data that is kept for a short period of time like a few days or months.
Users then downsample and aggregate that data into lower precision rollups that are kept around much longer.
The naive implementation would be to simply delete each record once it passes its expiration time.
However, that means that once the first points written reach their expiration date, the system is processing just as many deletes as writes, which is something most storage engines aren't designed for.
Let's dig into the details of the two types of storage engines we tried and how these properties had a significant impact on our performance.
## LevelDB and log structured merge trees
When the InfluxDB project began, we picked LevelDB as the storage engine because we had used it for time series data storage in the product that was the precursor to InfluxDB.
We knew that it had great properties for write throughput and everything seemed to "just work".
LevelDB is an implementation of a log structured merge tree (LSM tree) that was built as an open source project at Google.
It exposes an API for a key-value store where the key space is sorted.
This last part is important for time series data as it allowed us to quickly scan ranges of time as long as the timestamp was in the key.
LSM Trees are based on a log that takes writes and two structures known as Mem Tables and SSTables.
These tables represent the sorted keyspace.
SSTables are read-only files that are continuously replaced by other SSTables that merge inserts and updates into the keyspace.
The two biggest advantages that LevelDB had for us were high write throughput and built in compression.
However, as we learned more about what people needed with time series data, we encountered a few insurmountable challenges.
The first problem we had was that LevelDB doesn't support hot backups.
If you want to do a safe backup of the database, you have to close it and then copy it.
The LevelDB variants RocksDB and HyperLevelDB fix this problem, but there was another more pressing problem that we didn't think they could solve.
Our users needed a way to automatically manage data retention.
That meant we needed deletes on a very large scale.
In LSM Trees, a delete is as expensive, if not more so, than a write.
A delete writes a new record known as a tombstone.
After that queries merge the result set with any tombstones to purge the deleted data from the query return.
Later, a compaction runs that removes the tombstone record and the underlying deleted record in the SSTable file.
To get around doing deletes, we split data across what we call shards, which are contiguous blocks of time.
Shards would typically hold either one day or seven days' worth of data.
Each shard mapped to an underlying LevelDB.
This meant that we could drop an entire day of data by just closing out the database and removing the underlying files.
Users of RocksDB may at this point bring up a feature called ColumnFamilies.
When putting time series data into RocksDB, it's common to split blocks of time into column families and then drop those when their time is up.
It's the same general idea: create a separate area where you can just drop files instead of updating indexes when you delete a large block of data.
Dropping a column family is a very efficient operation.
However, column families are a fairly new feature and we had another use case for shards.
Organizing data into shards meant that it could be moved within a cluster without having to examine billions of keys.
At the time, it was not possible to move a column family from one RocksDB database to another.
Old shards are typically cold for writes so moving them around would be cheap and easy.
We would have the added benefit of having a spot in the keyspace that is cold for writes so it would be easier to do consistency checks later.
The organization of data into shards worked great for a while, until a large amount of data went into InfluxDB.
LevelDB splits the data out over many small files.
Having dozens or hundreds of these databases open in a single process ended up creating a big problem.
Users that had six months or a year of data would run out of file handles.
It's not something we found with the majority of users, but anyone pushing the database to its limits would hit this problem and we had no fix for it.
There were simply too many file handles open.
## BoltDB and mmap B+Trees
After struggling with LevelDB and its variants for a year we decided to move over to BoltDB, a pure Golang database heavily inspired by LMDB, a mmap B+Tree database written in C.
It has the same API semantics as LevelDB: a key value store where the keyspace is ordered.
Many of our users were surprised.
Our own posted tests of the LevelDB variants vs. LMDB (a mmap B+Tree) showed RocksDB as the best performer.
However, there were other considerations that went into this decision outside of the pure write performance.
At this point our most important goal was to get to something stable that could be run in production and backed up.
BoltDB also had the advantage of being written in pure Go, which simplified our build chain immensely and made it easy to build for other OSes and platforms.
The biggest win for us was that BoltDB used a single file as the database.
At this point our most common source of bug reports were from people running out of file handles.
Bolt solved the hot backup problem and the file limit problems all at the same time.
We were willing to take a hit on write throughput if it meant that we'd have a more reliable and stable system that we could build on.
Our reasoning was that anyone pushing really big write loads would be running a cluster anyway.
We released versions 0.9.0 to 0.9.2 based on BoltDB.
From a development perspective it was delightful.
Clean API, fast and easy to build in our Go project, and reliable.
However, after running for a while we found a big problem with write throughput.
After the database got over a few GB, writes would start spiking IOPS.
Some users were able to get past this by putting InfluxDB on big hardware with near unlimited IOPS.
However, most users are on VMs with limited resources in the cloud.
We had to figure out a way to reduce the impact of writing a bunch of points into hundreds of thousands of series at a time.
With the 0.9.3 and 0.9.4 releases our plan was to put a write ahead log (WAL) in front of Bolt.
That way we could reduce the number of random insertions into the keyspace.
Instead, we'd buffer up multiple writes that were next to each other and then flush them at once.
However, that only served to delay the problem.
High IOPS still became an issue and it showed up very quickly for anyone operating at even moderate workloads.
However, our experience building the first WAL implementation in front of Bolt gave us the confidence we needed that the write problem could be solved.
The performance of the WAL itself was fantastic, but the index simply could not keep up.
At this point we started thinking again about how we could create something similar to an LSM Tree that could keep up with our write load.
Thus was born the Time Structured Merge Tree.

View File

@ -0,0 +1,50 @@
---
title: Time Series Index (TSI) overview
menu:
influxdb_1_8:
name: Time Series Index (TSI) overview
weight: 70
parent: Concepts
---
Find overview and background information on Time Series Index (TSI) in this topic. For detail, including how to enable and configure TSI, see [Time Series Index (TSI) details](https://docs.influxdata.com/influxdb/v1.8/concepts/tsi-details/).
## Overview
To support a large number of time series (that is, a very high cardinality of unique time series stored in the database), InfluxData has added the new Time Series Index (TSI).
InfluxData supports customers using InfluxDB with tens of millions of time series.
InfluxData's goal, however, is to expand to hundreds of millions, and eventually billions.
Using InfluxData's TSI storage engine, users should be able to have millions of unique time series.
The goal is that the number of series should be unbounded by the amount of memory on the server hardware.
Importantly, the number of series that exist in the database will have a negligible impact on database startup time.
This work represents the most significant technical advancement in the database since InfluxData released the Time-Structured Merge Tree (TSM) storage engine in 2016.
## Background information
InfluxDB actually looks like two databases in one: a time series data store and an inverted index for the measurement, tag, and field metadata.
### Time-Structured Merge Tree (TSM)
The Time-Structured Merge Tree (TSM) engine that InfluxData built in 2015 and continued enhancing in 2016 was an effort to solve the problem of getting maximum throughput, compression, and query speed for raw time series data.
Up until TSI, the inverted index was an in-memory data structure that was built during startup of the database based on the data in TSM.
This meant that for every measurement, tag key-value pair, and field name, there was a lookup table in-memory to map those bits of metadata to an underlying time series.
For users with a high number of ephemeral series, memory utilization continued increasing as new time series were created.
And startup times increased, since all of that data had to be loaded onto the heap at start time.
> For details, see [TSM-based data storage and in-memory indexing](/influxdb/v1.8/concepts/storage_engine/).
### Time Series Index (TSI)
The new time series index (TSI) moves the index to files on disk that we memory map.
This means that we let the operating system manage which parts of the index stay in memory, effectively acting as a Least Recently Used (LRU) cache.
Much like the TSM engine for raw time series data, we have a write-ahead log with an in-memory structure that gets merged at query time with the memory-mapped index.
Background routines run constantly to compact the index into larger and larger files to avoid having to do too many index merges at query time.
Under the covers, we're using techniques like Robin Hood Hashing to do fast index lookups and HyperLogLog++ to keep sketches of cardinality estimates.
The latter gives us the ability to add features to the query language like the [SHOW CARDINALITY](/influxdb/v1.8/query_language/spec#show-cardinality) queries.
### Issues solved by TSI and remaining to be solved
The primary issue that Time Series Index (TSI) addresses is ephemeral time series. Most frequently, this occurs in use cases that want to track per-process or per-container metrics by putting identifiers in tags. For example, the [Heapster project for Kubernetes](https://github.com/kubernetes/heapster) does this. Series that are no longer hot for writes or queries won't take up space in memory.
The issue that TSI does not yet address is limiting the scope of data returned by the `SHOW` queries. We'll have updates to the query language in the future to limit those results by time. We also don't solve the problem of having all these series hot for reads and writes. For that problem, scale-out clustering is the solution. We'll have to continue to optimize the query language and engine to work with large sets of series. We'll need to add guard rails and limits into the language and, eventually, add spill-to-disk query processing. That work will be ongoing in every release of InfluxDB.

View File

@ -0,0 +1,172 @@
---
title: Time Series Index (TSI) details
menu:
influxdb_1_8:
name: Time Series Index (TSI) details
weight: 80
parent: Concepts
---
When InfluxDB ingests data, we store not only the value but also index the measurement and tag information so that it can be queried quickly.
In earlier versions, index data could only be stored in memory; however, that requires a lot of RAM and places an upper bound on the number of series a machine can hold.
This upper bound is usually somewhere between 1 and 4 million series, depending on the machine used.
The Time Series Index (TSI) was developed to allow us to go past that upper bound.
TSI stores index data on disk so that we are no longer restricted by RAM.
TSI uses the operating system's page cache to pull hot data into memory and let cold data rest on disk.
## Enable TSI
To enable TSI, set the following line in the InfluxDB configuration file (`influxdb.conf`):
```
index-version = "tsi1"
```
(Be sure to include the double quotes.)
### InfluxDB Enterprise
- To convert your data nodes to support TSI, see [Upgrade InfluxDB Enterprise clusters](https://docs.influxdata.com/enterprise_influxdb/v1.8/administration/upgrading/).
- For detail on configuration, see [Configure InfluxDB Enterprise clusters](https://docs.influxdata.com/enterprise_influxdb/v1.8/administration/configuration/#sidebar).
### InfluxDB OSS
- For detail on configuration, see [Configuring InfluxDB OSS](https://docs.influxdata.com/influxdb/v1.8/administration/config/#sidebar).
## Tooling
### `influx_inspect dumptsi`
If you are troubleshooting an issue with an index, you can use the `influx_inspect dumptsi` command.
This command allows you to print summary statistics on an index, file, or a set of files.
This command only works on one index at a time.
For details on this command, see [influx_inspect dumptsi](/influxdb/v1.8/tools/influx_inspect/#dumptsi).
### `influx_inspect buildtsi`
If you want to convert an existing shard from an in-memory index to a TSI index, or if you have an existing TSI index which has become corrupt, you can use the `buildtsi` command to create the index from the underlying TSM data.
If you have an existing TSI index that you want to rebuild, first delete the `index` directory within your shard.
This command works at the server level, but you can optionally add database, retention policy, and shard filters to apply it only to a subset of shards.
For details on this command, see [influx inspect buildtsi](/influxdb/v1.8/tools/influx_inspect/#buildtsi).
## Understanding TSI
### File organization
TSI (Time Series Index) is a log-structured merge tree-based database for InfluxDB series data.
TSI is composed of several parts:
* **Index**: Contains the entire index dataset for a single shard.
* **Partition**: Contains a sharded partition of the data for a shard.
* **LogFile**: Contains newly written series as an in-memory index and is persisted as a WAL.
* **IndexFile**: Contains an immutable, memory-mapped index built from a LogFile or merged from two contiguous index files.
There is also a **SeriesFile** which contains a set of all series keys across the entire database.
Each shard within the database shares the same series file.
### Writes
The following occurs when a write comes into the system:
1. The series is added to the series file, or looked up if it already exists. This returns an auto-incrementing series ID.
2. The series is sent to the Index. The index maintains a roaring bitmap of existing series IDs and ignores series that have already been created.
3. The series is hashed and sent to the appropriate Partition (see the sketch after this list).
4. The Partition writes the series as an entry to the LogFile.
5. The LogFile writes the series to a write-ahead log file on disk and adds the series to a set of in-memory indexes.
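As a rough illustration of step 3, hashing a series key onto a partition might look like the following in Go; the FNV hash and partition count are assumptions for the sketch, not the actual TSI implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor hashes a series key and maps it onto one of n partitions.
func partitionFor(seriesKey string, n int) int {
	h := fnv.New64a()
	h.Write([]byte(seriesKey))
	return int(h.Sum64() % uint64(n))
}

func main() {
	fmt.Println(partitionFor("cpu,host=server01", 8))
}
```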
### Compaction
Once the LogFile exceeds a threshold (5MB), a new active log file is created and the previous one begins compacting into an IndexFile.
This first index file is at level 1 (L1).
The log file is considered level 0 (L0).
Index files can also be created by merging two smaller index files together.
For example, if two contiguous L1 index files exist, they can be merged into an L2 index file.
### Reads
The index provides several API calls for retrieving sets of data such as:
* `MeasurementIterator()`: Returns a sorted list of measurement names.
* `TagKeyIterator()`: Returns a sorted list of tag keys in a measurement.
* `TagValueIterator()`: Returns a sorted list of tag values for a tag key.
* `MeasurementSeriesIDIterator()`: Returns a sorted list of all series IDs for a measurement.
* `TagKeySeriesIDIterator()`: Returns a sorted list of all series IDs for a tag key.
* `TagValueSeriesIDIterator()`: Returns a sorted list of all series IDs for a tag value.
These iterators are all composable using several merge iterators.
For each type of iterator (measurement, tag key, tag value, series id), there are multiple merge iterator types:
* **Merge**: Deduplicates items from two iterators.
* **Intersect**: Returns only items that exist in two iterators.
* **Difference**: Returns only items from the first iterator that don't exist in the second iterator.
For example, a query with a WHERE clause of `region != 'us-west'` that operates across two shards will construct a set of iterators like this:
```
DifferenceSeriesIDIterators(
MergeSeriesIDIterators(
Shard1.MeasurementSeriesIDIterator("m"),
Shard2.MeasurementSeriesIDIterator("m"),
),
MergeSeriesIDIterators(
Shard1.TagValueSeriesIDIterator("m", "region", "us-west"),
Shard2.TagValueSeriesIDIterator("m", "region", "us-west"),
),
)
```
### Log File Structure
The log file is simply structured as a list of LogEntry objects written to disk in sequential order. Log files are written until they reach 5MB and then they are compacted into index files.
The entry objects in the log can be of any of the following types:
* AddSeries
* DeleteSeries
* DeleteMeasurement
* DeleteTagKey
* DeleteTagValue
The in-memory index on the log file tracks the following:
* Measurements by name
* Tag keys by measurement
* Tag values by tag key
* Series by measurement
* Series by tag value
* Tombstones for series, measurements, tag keys, and tag values.
The log file also maintains bitsets for series ID existence and tombstones.
These bitsets are merged with other log files and index files to regenerate the full index bitset on startup.
### Index File Structure
The index file is an immutable file that tracks similar information to the log file, but all data is indexed and written to disk so that it can be directly accessed from a memory-map.
The index file has the following sections:
* **TagBlocks:** Maintains an index of tag values for a single tag key.
* **MeasurementBlock:** Maintains an index of measurements and their tag keys.
* **Trailer:** Stores offset information for the file as well as HyperLogLog sketches for cardinality estimation.
### Manifest
The MANIFEST file is stored in the index directory and lists all the files that belong to the index and the order in which they should be accessed.
This file is updated every time a compaction occurs.
Any files that are in the directory but not in the MANIFEST file are index files that are in the process of being compacted.
### FileSet
A file set is an in-memory snapshot of the manifest that is obtained while the InfluxDB process is running.
This is required to provide a consistent view of the index at a point-in-time.
The file set also facilitates reference counting for all of its files so that no file will be deleted via compaction until all readers of the file are done with it.

View File

@ -0,0 +1,22 @@
---
title: Diamond
---
## Saving Diamond Metrics into InfluxDB
Diamond is a metrics collection and delivery daemon written in Python.
It is capable of collecting CPU, memory, network, I/O, load, and disk metrics.
Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.
[Diamond homepage](https://github.com/python-diamond)
Diamond started supporting InfluxDB at version 3.5.
## Configuring Diamond to send metrics to InfluxDB
Prerequisites: Diamond depends on the [influxdb python client](https://github.com/influxdb/influxdb-python).
InfluxDB-version-specific installation instructions for the influxdb python client can be found on their [github page](https://github.com/influxdb/influxdb-python).
[Diamond InfluxdbHandler configuration page](https://github.com/python-diamond/Diamond/wiki/handler-InfluxdbHandler)

View File

@ -0,0 +1,19 @@
---
title: OpenTSDB
---
InfluxDB supports the OpenTSDB ["telnet" protocol](http://opentsdb.net/docs/build/html/user_guide/writing/index.html#telnet).
When OpenTSDB support is enabled, InfluxDB can act as a drop-in replacement for your OpenTSDB system.
An example input point, and how it is processed, is shown below.
```
put sys.cpu.user 1356998400 42.5 host=webserver01 cpu=0
```
When InfluxDB receives this data, a point is written to the database.
The point's Measurement is `sys.cpu.user`, the timestamp is `1356998400`, and the value is `42.5`.
The point is also tagged with `host=webserver01` and `cpu=0`.
Tags allow fast and efficient queries to be performed on your data.
To learn more about enabling OpenTSDB support, check the example [configuration file](https://github.com/influxdb/influxdb/blob/1.8/etc/config.sample.toml).

View File

@ -0,0 +1,32 @@
---
title: External resources
---
But wait, there's more!
Check out these resources to learn more about InfluxDB.
## [InfluxData blog](https://www.influxdata.com/blog/)
Check out the InfluxData Blog for announcements, updates, and
weekly [tech tips](https://www.influxdata.com/category/tech-tips/).
## [Technical papers](https://www.influxdata.com/_resources/techpapers-new/)
InfluxData's Technical Papers series offers in-depth analysis of performance, time series,
and benchmarking InfluxDB vs. other popular databases.
## [Meetup videos](https://www.influxdata.com/_resources/videosnew//)
Check out our growing Meetup videos collection for introductory content, how-tos, and more.
## [Virtual training videos](https://www.influxdata.com/_resources/videosnew/)
Watch the videos from our weekly training webinar.
## [Virtual training schedule](https://www.influxdata.com/virtual-training-courses/)
Check out our virtual training schedule to register for future webinars.
## [InfluxData events](https://www.influxdata.com/events/)
Find out what's happening at InfluxData and sign up for upcoming events.

View File

@ -0,0 +1,10 @@
---
title: InfluxDB guides
menu:
influxdb_1_8:
name: Guides
weight: 40
---
{{< children type="list">}}

View File

@ -0,0 +1,264 @@
---
title: Calculate percentages in a query
description: Calculate percentages using basic math operators available in InfluxQL or Flux. This guide walks through use-cases and examples of calculating percentages from two values in a single query.
menu:
influxdb_1_8:
weight: 50
parent: Guides
name: Calculate percentages
aliases:
- /influxdb/v1.8/guides/calculating_percentages/
---
Use Flux or InfluxQL to calculate percentages in a query.
{{< tabs-wrapper >}}
{{% tabs %}}
[Flux](#)
[InfluxQL](#)
{{% /tabs %}}
{{% tab-content %}}
[Flux](/flux/latest/) lets you perform simple math operations, for example, calculating a percentage.
## Calculate a percentage
Learn how to calculate a percentage using the following examples:
- [Basic calculations within a query](#basic-calculations-within-a-query)
- [Calculate a percentage from two fields](#calculate-a-percentage-from-two-fields)
- [Calculate a percentage using aggregate functions](#calculate-a-percentage-using-aggregate-functions)
- [Calculate the percentage of total weight per apple variety](#calculate-the-percentage-of-total-weight-per-apple-variety)
- [Calculate the average percentage of total weight per variety each hour](#calculate-the-average-percentage-of-total-weight-per-variety-each-hour)
## Basic calculations within a query
When performing any math operation in a Flux query, you must complete the following steps:
1. Specify the [bucket](/flux/latest/introduction/getting-started/#buckets) to query from and the time range to query.
2. Filter your data by measurements, fields, and other applicable criteria.
3. Align values in one row (required to perform math in Flux) by using one of the following functions:
- To query **from multiple** data sources, use the [`join()` function](/flux/latest/stdlib/built-in/transformations/join/).
- To query **from the same** data source, use the [`pivot()` function](/flux/latest/stdlib/built-in/transformations/pivot/).
For examples using the `join()` function to calculate percentages and more examples of calculating percentages, see [Calculate percentages with Flux](/flux/latest/guides/calculate-percentages/).
#### Data variable
To shorten examples, we'll store a basic Flux query in a `data` variable for reuse.
Here's how that looks in Flux:
```js
// Query data from the past 15 minutes and pivot fields into columns
// so each row contains values for each field
data = from(bucket:"your_db/your_retention_policy")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /field[1-2]/)
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
```
Each row now contains the values necessary to perform a math operation. For example, to add two field keys, start with the `data` variable created above, and then use `map()` to re-map values in each row.
```js
data
|> map(fn: (r) => ({ r with _value: r.field1 + r.field2}))
```
> **Note:** Flux supports basic math operators such as `+`,`-`,`/`, `*`, and `()`. For example, to subtract `field2` from `field1`, change `+` to `-`.
## Calculate a percentage from two fields
Use the `data` variable created above, and then use the [`map()` function](/flux/latest/stdlib/built-in/transformations) to divide one field by another, multiply by 100, and add a new `percent` field to store the percentage values in.
```js
data
|> map(fn: (r) => ({
_time: r._time,
_measurement: r._measurement,
_field: "percent",
_value: r.field1 / r.field2 * 100.0
}))
```
>**Note:** In this example, `field1` and `field2` are float values, so they are multiplied by `100.0`. For integer values, multiply by `100` or use the `float()` function to cast integers to floats.
## Calculate a percentage using aggregate functions
Use [`aggregateWindow()`](/flux/latest/stdlib/built-in/transformations/aggregates/aggregatewindow) to window data by time and perform an aggregate function on each window.
```js
from(bucket:"<database>/<retention_policy>")
|> range(start: -15m)
|> filter(fn: (r) => r._measurement == "measurement_name" and r._field =~ /fieldkey[1-2]/)
|> aggregateWindow(every: 1m, fn:sum)
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with _value: r.field1 / r.field2 * 100.0 }))
```
## Calculate the percentage of total weight per apple variety
Use simulated apple stand data to track the weight of apples (by type) throughout a day.
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
2. Import the sample data:
```bash
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
```
Use the following query to calculate the percentage of the total weight each variety
accounts for at each given point in time.
```js
from(bucket:"apple_stand/autogen")
|> range(start: 2018-06-18T12:00:00Z, stop: 2018-06-19T04:35:00Z)
|> filter(fn: (r) => r._measurement == "variety")
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with
granny_smith: r.granny_smith / r.total_weight * 100.0 ,
golden_delicious: r.golden_delicious / r.total_weight * 100.0 ,
fuji: r.fuji / r.total_weight * 100.0 ,
gala: r.gala / r.total_weight * 100.0 ,
braeburn: r.braeburn / r.total_weight * 100.0 ,}))
```
## Calculate the average percentage of total weight per variety each hour
With the apple stand data from the prior example, use the following query to calculate the average percentage of the total weight each variety accounts for per hour.
```js
from(bucket:"apple_stand/autogen")
|> range(start: 2018-06-18T00:00:00.00Z, stop: 2018-06-19T16:35:00.00Z)
|> filter(fn: (r) => r._measurement == "variety")
|> aggregateWindow(every:1h, fn: mean)
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> map(fn: (r) => ({ r with
granny_smith: r.granny_smith / r.total_weight * 100.0,
golden_delicious: r.golden_delicious / r.total_weight * 100.0,
fuji: r.fuji / r.total_weight * 100.0,
gala: r.gala / r.total_weight * 100.0,
braeburn: r.braeburn / r.total_weight * 100.0
}))
```
{{% /tab-content %}}
{{% tab-content %}}
[InfluxQL](/influxdb/v1.8/query_language/) lets you perform simple math operations,
which makes calculating percentages using two fields in a measurement fairly simple.
However, there are some caveats to be aware of.
## Basic calculations within a query
`SELECT` statements support the use of basic math operators such as `+`,`-`,`/`, `*`, `()`, etc.
```sql
-- Add two field keys
SELECT field_key1 + field_key2 AS "field_key_sum" FROM "measurement_name" WHERE time < now() - 15m
-- Subtract one field from another
SELECT field_key1 - field_key2 AS "field_key_difference" FROM "measurement_name" WHERE time < now() - 15m
-- Grouping and chaining mathematical calculations
SELECT (field_key1 + field_key2) - (field_key3 + field_key4) AS "some_calculation" FROM "measurement_name" WHERE time < now() - 15m
```
## Calculating a percentage in a query
Using basic math functions, you can calculate a percentage by dividing one field value
by another and multiplying the result by 100:
```sql
SELECT (field_key1 / field_key2) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m
```
## Calculating a percentage using aggregate functions
If using aggregate functions in your percentage calculation, all data must be referenced
using aggregate functions.
_**You can't mix aggregate and non-aggregate data.**_
All aggregate functions need a `GROUP BY time()` clause defining the time intervals
in which data points are grouped and aggregated.
```sql
SELECT (sum(field_key1) / sum(field_key2)) * 100 AS "calculated_percentage" FROM "measurement_name" WHERE time < now() - 15m GROUP BY time(1m)
```
## Examples
#### Sample data
The following example uses simulated Apple Stand data that tracks the weight of
baskets containing different varieties of apples throughout a day of business.
1. [Download the sample data](https://gist.githubusercontent.com/sanderson/8f8aec94a60b2c31a61f44a37737bfea/raw/c29b239547fa2b8ee1690f7d456d31f5bd461386/apple_stand.txt)
2. Import the sample data:
```bash
influx -import -path=path/to/apple_stand.txt -precision=ns -database=apple_stand
```
### Calculating percentage of total weight per apple variety
The following query calculates the percentage of the total weight each variety
accounts for at each given point in time.
```sql
SELECT
("braeburn"/total_weight)*100,
("granny_smith"/total_weight)*100,
("golden_delicious"/total_weight)*100,
("fuji"/total_weight)*100,
("gala"/total_weight)*100
FROM "apple_stand"."autogen"."variety"
```
<div class='view-in-chronograf' data-query-override='SELECT
("braeburn"/total_weight)*100,
("granny_smith"/total_weight)*100,
("golden_delicious"/total_weight)*100,
("fuji"/total_weight)*100,
("gala"/total_weight)*100
FROM "apple_stand"."autogen"."variety"'>
\*</div>
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
in Chronograf, it would look like:
![Percentage of total per apple variety](/img/influxdb/calc-percentage-apple-variety.png)
### Calculating aggregate percentage per variety
The following query calculates the average percentage of the total weight each variety
accounts for per hour.
```sql
SELECT
(mean("braeburn")/mean(total_weight))*100,
(mean("granny_smith")/mean(total_weight))*100,
(mean("golden_delicious")/mean(total_weight))*100,
(mean("fuji")/mean(total_weight))*100,
(mean("gala")/mean(total_weight))*100
FROM "apple_stand"."autogen"."variety"
WHERE time >= '2018-06-18T12:00:00Z' AND time <= '2018-06-19T04:35:00Z'
GROUP BY time(1h)
```
<div class='view-in-chronograf' data-query-override='SELECT%0A%20%20%20%20%28mean%28"braeburn"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"granny_smith"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"golden_delicious"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"fuji"%29%2Fmean%28total_weight%29%29%2A100%2C%0A%20%20%20%20%28mean%28"gala"%29%2Fmean%28total_weight%29%29%2A100%0AFROM%20"apple_stand"."autogen"."variety"%0AWHERE%20time%20>%3D%20%272018-06-18T12%3A00%3A00Z%27%20AND%20time%20<%3D%20%272018-06-19T04%3A35%3A00Z%27%0AGROUP%20BY%20time%281h%29'></div>
_**Note the following about this query:**_
- It uses aggregate functions (`mean()`) for pulling all data.
- It includes a `GROUP BY time()` clause which aggregates data into 1 hour blocks.
- It includes an explicitly limited time window. Without it, aggregate functions
are very resource-intensive.
If visualized as a [stacked graph](/chronograf/v1.8/guides/visualization-types/#stacked-graph)
in Chronograf, it would look like:
![Hourly average percentage of total per apple variety](/img/influxdb/calc-percentage-hourly-apple-variety.png)
{{% /tab-content %}}
{{< /tabs-wrapper >}}

View File

@ -0,0 +1,222 @@
---
title: Downsample and retain data
menu:
influxdb_1_8:
weight: 30
parent: Guides
aliases:
- /influxdb/v1.8/guides/downsampling_and_retention/
---
InfluxDB can handle hundreds of thousands of data points per second. Working with that much data over a long period of time can create storage concerns.
A natural solution is to downsample the data; keep the high precision raw data for only a limited time, and store the lower precision, summarized data longer.
This guide describes how to automate the process of downsampling data and expiring old data using InfluxQL. To downsample and retain data using Flux and InfluxDB 2.0,
see [Process Data with InfluxDB tasks](https://v2.docs.influxdata.com/v2.0/process-data/).
### Definitions
- **Continuous query** (CQ) is an InfluxQL query that runs automatically and periodically within a database.
CQs require a function in the `SELECT` clause and must include a `GROUP BY time()` clause.
- **Retention policy** (RP) is the part of the InfluxDB data structure that describes how long InfluxDB keeps data.
InfluxDB compares your local server's timestamp to the timestamps on your data and deletes data older than the RP's `DURATION`.
A single database can have several RPs and RPs are unique per database.
This guide doesn't go into detail about the syntax for creating and managing CQs and RPs or tasks.
If you're new to these concepts, we recommend reviewing the following:
- [CQ documentation](/influxdb/v1.8/query_language/continuous_queries/) and
- [RP documentation](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
### Sample data
This section uses fictional real-time data to track the number of food orders
to a restaurant via phone and via website at ten second intervals.
We store this data in a [database](/influxdb/v1.8/concepts/glossary/#database) or bucket called `food_data`, in
the [measurement](/influxdb/v1.8/concepts/glossary/#measurement) `orders`, and
in the [fields](/influxdb/v1.8/concepts/glossary/#field) `phone` and `website`.
Sample:
```bash
name: orders
------------
time phone website
2016-05-10T23:18:00Z 10 30
2016-05-10T23:18:10Z 12 39
2016-05-10T23:18:20Z 11 56
```
### Goal
Assume that, in the long run, we're only interested in the average number of orders by phone
and by website at 30 minute intervals.
In the next steps, we use RPs and CQs to:
* Automatically aggregate the ten-second resolution data to 30-minute resolution data
* Automatically delete the raw, ten-second resolution data that are older than two hours
* Automatically delete the 30-minute resolution data that are older than 52 weeks
### Database preparation
We perform the following steps before writing the data to the database
`food_data`.
We do this **before** inserting any data because CQs only run against recent
data; that is, data with timestamps that are no older than `now()` minus
the `FOR` clause of the CQ, or `now()` minus the `GROUP BY time()` interval if
the CQ has no `FOR` clause.
#### 1. Create the database
```sql
> CREATE DATABASE "food_data"
```
#### 2. Create a two-hour `DEFAULT` retention policy
InfluxDB writes to the `DEFAULT` retention policy if we do not supply an explicit RP when
writing a point to the database.
We make the `DEFAULT` RP keep data for two hours, because we want InfluxDB to
automatically write the incoming ten-second resolution data to that RP.
Use the
[`CREATE RETENTION POLICY`](/influxdb/v1.8/query_language/database_management/#create-retention-policies-with-create-retention-policy)
statement to create a `DEFAULT` RP:
```sql
> CREATE RETENTION POLICY "two_hours" ON "food_data" DURATION 2h REPLICATION 1 DEFAULT
```
That query creates an RP called `two_hours` that exists in the database
`food_data`.
`two_hours` keeps data for a `DURATION` of two hours (`2h`) and it's the `DEFAULT`
RP for the database `food_data`.
{{% warn %}}
The replication factor (`REPLICATION 1`) is a required parameter but must always
be set to 1 for single node instances.
{{% /warn %}}
> **Note:** When we created the `food_data` database in step 1, InfluxDB
automatically generated an RP named `autogen` and set it as the `DEFAULT`
RP for the database.
The `autogen` RP has an infinite retention period.
With the query above, the RP `two_hours` replaces `autogen` as the `DEFAULT` RP
for the `food_data` database.
#### 3. Create a 52-week retention policy
Next we want to create another retention policy that keeps data for 52 weeks and is not the
`DEFAULT` retention policy (RP) for the database.
Ultimately, the 30-minute rollup data will be stored in this RP.
Use the
[`CREATE RETENTION POLICY`](/influxdb/v1.8/query_language/database_management/#create-retention-policies-with-create-retention-policy)
statement to create a non-`DEFAULT` retention policy:
```sql
> CREATE RETENTION POLICY "a_year" ON "food_data" DURATION 52w REPLICATION 1
```
That query creates a retention policy (RP) called `a_year` that exists in the database
`food_data`.
The `a_year` setting keeps data for a `DURATION` of 52 weeks (`52w`).
Leaving out the `DEFAULT` argument ensures that `a_year` is not the `DEFAULT`
RP for the database `food_data`.
That is, write and read operations against `food_data` that do not specify an
RP will still go to the `two_hours` RP (the `DEFAULT` RP).
#### 4. Create the continuous query
Now that we've set up our RPs, we want to create a continuous query (CQ) that will automatically
and periodically downsample the ten-second resolution data to the 30-minute
resolution, and then store those results in a different measurement with a different
retention policy.
Use the
[`CREATE CONTINUOUS QUERY`](/influxdb/v1.8/query_language/continuous_queries/)
statement to generate a CQ:
```sql
> CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN
SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone"
INTO "a_year"."downsampled_orders"
FROM "orders"
GROUP BY time(30m)
END
```
That query creates a CQ called `cq_30m` in the database `food_data`.
`cq_30m` tells InfluxDB to calculate the 30-minute average of the two fields
`website` and `phone` in the measurement `orders` and in the `DEFAULT` RP
`two_hours`.
It also tells InfluxDB to write those results to the measurement
`downsampled_orders` in the retention policy `a_year` with the field keys
`mean_website` and `mean_phone`.
InfluxDB will run this query every 30 minutes for the previous 30 minutes.
> **Note:** Notice that we fully qualify (that is, we use the syntax
`"<retention_policy>"."<measurement>"`) the measurement in the `INTO`
clause.
InfluxDB requires that syntax to write data to an RP other than the `DEFAULT`
RP.
### Results
With the new CQ and two new RPs, `food_data` is ready to start receiving data.
After writing data to our database and letting things run for a bit, we see
two measurements: `orders` and `downsampled_orders`.
```sql
> SELECT * FROM "orders" LIMIT 5
name: orders
---------
time phone website
2016-05-13T23:00:00Z 10 30
2016-05-13T23:00:10Z 12 39
2016-05-13T23:00:20Z 11 56
2016-05-13T23:00:30Z 8 34
2016-05-13T23:00:40Z 17 32
> SELECT * FROM "a_year"."downsampled_orders" LIMIT 5
name: downsampled_orders
---------------------
time mean_phone mean_website
2016-05-13T15:00:00Z 12 23
2016-05-13T15:30:00Z 13 32
2016-05-13T16:00:00Z 19 21
2016-05-13T16:30:00Z 3 26
2016-05-13T17:00:00Z 4 23
```
The data in `orders` are the raw, ten-second resolution data that reside in the
two-hour RP.
The data in `downsampled_orders` are the aggregated, 30-minute resolution data
that are subject to the 52-week RP.
Notice that the first timestamps in `downsampled_orders` are older than the first
timestamps in `orders`.
This is because InfluxDB has already deleted data from `orders` with timestamps
that are older than our local server's timestamp minus two hours (assume we
executed the `SELECT` queries at `2016-05-14T00:59:59Z`).
InfluxDB will only start dropping data from `downsampled_orders` after 52 weeks.
> **Notes:**
>
* Notice that we fully qualify (that is, we use the syntax
`"<retention_policy>"."<measurement>"`) `downsampled_orders` in
the second `SELECT` statement. We must specify the RP in that query to `SELECT`
data that reside in an RP other than the `DEFAULT` RP.
>
* By default, InfluxDB checks to enforce an RP every 30 minutes.
Between checks, `orders` may have data that are older than two hours.
The rate at which InfluxDB checks to enforce an RP is a configurable setting,
see
[Database Configuration](/influxdb/v1.8/administration/config#check-interval-30m0s).
Using a combination of RPs and CQs, we've successfully set up our database to
automatically keep the high precision raw data for a limited time, create lower
precision data, and store that lower precision data for a longer period of time.
Now that you have a general understanding of how these features can work
together, check out the detailed documentation on [CQs](/influxdb/v1.8/query_language/continuous_queries/) and [RPs](/influxdb/v1.8/query_language/database_management/#retention-policy-management)
to see all that they can do for you.

View File

@ -0,0 +1,474 @@
---
title: Hardware sizing guidelines
menu:
influxdb_1_8:
weight: 40
parent: Guides
---
Review configuration and hardware guidelines for InfluxDB OSS (open source) and InfluxDB Enterprise:
* [Single node or cluster?](#single-node-or-cluster)
* [Query guidelines](#query-guidelines)
* [InfluxDB OSS guidelines](#influxdb-oss-guidelines)
* [InfluxDB Enterprise cluster guidelines](#influxdb-enterprise-cluster-guidelines)
* [When do I need more RAM?](#when-do-i-need-more-ram)
* [Recommended cluster configurations](#recommended-cluster-configurations)
* [Storage: type, amount, and configuration](#storage-type-amount-and-configuration)
> **Disclaimer:** Your numbers may vary from recommended guidelines. Guidelines provide estimated benchmarks for implementing the most performant system for your business.
## Single node or cluster?
If your InfluxDB performance requires any of the following, a single node (InfluxDB OSS) may not support your needs:
- more than 750,000 field writes per second
- more than 100 moderate queries per second ([see query guidelines](#query-guidelines))
- more than 10,000,000 [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality)
We recommend InfluxDB Enterprise, which supports multiple data nodes (a cluster) across multiple server cores.
InfluxDB Enterprise distributes multiple copies of your data across a cluster,
providing high availability and redundancy, so an unavailable node doesn't significantly impact the cluster.
Please [contact us](https://www.influxdata.com/contact-sales/) for assistance tuning your system.
If you want a single node instance of InfluxDB that's fully open source, requires fewer writes, queries, and unique series than listed above, and you do **not** require redundancy, we recommend InfluxDB OSS.
> **Note:** Without the redundancy of a cluster, writes and queries fail immediately when a server is unavailable.
## Query guidelines
> Query complexity varies widely in its impact on the system. Recommendations for both single nodes and clusters are based on **moderate** query loads.
For **simple** or **complex** queries, we recommend testing and adjusting the suggested requirements as needed. Query complexity is defined by the following criteria:
| Query complexity | Criteria |
|------------------|---------------------------------------------------------------------------------------|
| Simple | Have few or no functions and no regular expressions |
| | Are bounded in time to a few minutes, hours, or 24 hours at most |
| | Typically execute in a few milliseconds to a few dozen milliseconds |
| Moderate | Have multiple functions and one or two regular expressions |
| | May also have `GROUP BY` clauses or sample a time range of multiple weeks |
| | Typically execute in a few hundred or a few thousand milliseconds |
| Complex | Have multiple aggregation or transformation functions or multiple regular expressions |
| | May sample a very large time range of months or years |
| | Typically take multiple seconds to execute |
## InfluxDB OSS guidelines
Run InfluxDB on locally attached solid state drives (SSDs). Other storage configurations have lower performance and may not be able to recover from small interruptions in normal processing.
Estimated guidelines include writes per second, queries per second, and number of unique [series](/influxdb/v1.8/concepts/glossary/#series), CPU, RAM, and IOPS (input/output operations per second).
| vCPU or CPU | RAM | IOPS | Writes per second | Queries* per second | Unique series |
| ----------: | ------: | -------: | ----------------: | ------------------: | ------------: |
| 2-4 cores | 2-4 GB | 500 | < 5,000 | < 5 | < 100,000 |
| 4-6 cores | 8-32 GB | 500-1000 | < 250,000 | < 25 | < 1,000,000 |
| 8+ cores | 32+ GB | 1000+ | > 250,000 | > 25 | > 1,000,000 |
\* **Queries per second for moderate queries.** Queries vary widely in their impact on the system. For simple or complex queries, we recommend testing and adjusting the suggested requirements as needed. See [query guidelines](#query-guidelines) for details.
## InfluxDB Enterprise cluster guidelines
### Meta nodes
> Set up clusters with an odd number of meta nodes; an even number may cause issues in certain configurations.
A cluster must have a **minimum of three** independent meta nodes for data redundancy and availability. A cluster with `2n + 1` meta nodes can tolerate the loss of `n` meta nodes.
Meta nodes do not need very much computing power. Regardless of the cluster load, we recommend the following guidelines for the meta nodes:
* vCPU or CPU: 1-2 cores
* RAM: 512 MB - 1 GB
* IOPS: 50
### Web node
The InfluxDB Enterprise web server is primarily an HTTP server with similar load requirements. For most applications, the server doesn't need to be very robust. A cluster can function with only one web server, but for redundancy, we recommend connecting multiple web servers to a single back-end Postgres database.
> **Note:** Production clusters should not use the SQLite database (lacks support for redundant web servers and handling high loads).
* vCPU or CPU: 2-4 cores
* RAM: 2-4 GB
* IOPS: 100
### Data nodes
A cluster with one data node is valid but has no data redundancy. Redundancy is set by the [replication factor](/influxdb/v1.8/concepts/glossary/#replication-factor) on the retention policy the data is written to. Where `n` is the replication factor, a cluster can lose `n - 1` data nodes and return complete query results.
>**Note:** For optimal data distribution within the cluster, use an even number of data nodes.
Guidelines vary by writes per second per node, moderate queries per second per node, and the number of unique series per node.
#### Guidelines per node
| vCPU or CPU | RAM | IOPS | Writes per second | Queries* per second | Unique series |
| ----------: | -------: | ----: | ----------------: | ------------------: | ------------: |
| 2 cores | 4-8 GB | 1000 | < 5,000 | < 5 | < 100,000 |
| 4-6 cores | 16-32 GB | 1000+ | < 100,000 | < 25 | < 1,000,000 |
| 8+ cores | 32+ GB | 1000+ | > 100,000 | > 25 | > 1,000,000 |
\* Guidelines are provided for moderate queries. Queries vary widely in their impact on the system. For simple or complex queries, we recommend testing and adjusting the suggested requirements as needed. See [query guidelines](#query-guidelines) for details.
## When do I need more RAM?
In general, more RAM helps queries return faster. Your RAM requirements are primarily determined by [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality). Higher cardinality requires more RAM. Regardless of RAM, a series cardinality of 10 million or more can cause OOM (out of memory) failures. You can usually resolve OOM issues by redesigning your [schema](/influxdb/v1.8/concepts/glossary/#schema).
RAM requirements grow with series cardinality following a power law, with an exponent between one and two:
![Series Cardinality](/img/influxdb/series-cardinality.png)
## Guidelines per cluster
InfluxDB Enterprise guidelines vary by writes and queries per second, series cardinality, replication factor, and infrastructure (AWS EC2 R4 instances or equivalent):
- R4.xlarge (4 cores)
- R4.2xlarge (8 cores)
- R4.4xlarge (16 cores)
- R4.8xlarge (32 cores)
> Guidelines stem from a DevOps monitoring use case: maintaining a group of computers and monitoring server metrics (such as CPU, kernel, memory, disk space, disk I/O, network, and so on).
### Recommended cluster configurations
Cluster configurations guidelines are organized by:
- Series cardinality in your data set: 10,000, 100,000, 1,000,000, or 10,000,000
- Number of data nodes
- Number of server cores
For each cluster configuration, you'll find guidelines for the following:
- **maximum writes per second only** (no dashboard queries are running)
- **maximum queries per second only** (no data is being written)
- **maximum simultaneous queries and writes per second, combined**
#### Review cluster configuration tables
1. Select the series cardinality tab below, and then click to expand a replication factor.
2. In the **Nodes x Core** column, find the number of data nodes and server cores in your configuration, and then review the recommended **maximum** guidelines.
{{< tabs-wrapper >}}
{{% tabs %}}
[10,000 series](#)
[100,000 series](#)
[1,000,000 series](#)
[10,000,000 series](#)
{{% /tabs %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 10,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 1 x 4 | 188,000 | 5 | 4 + 99,000 |
| 1 x 8 | 405,000 | 9 | 8 + 207,000 |
| 1 x 16 | 673,000 | 15 | 14 + 375,000 |
| 1 x 32 | 1,056,000 | 24 | 22 + 650,000 |
| 2 x 4 | 384,000 | 14 | 14 + 184,000 |
| 2 x 8 | 746,000 | 22 | 22 + 334,000 |
| 2 x 16 | 1,511,000 | 56 | 40 + 878,000 |
| 2 x 32 | 2,426,000 | 96 | 68 + 1,746,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 296,000 | 16 | 16 + 151,000 |
| 2 x 8 | 560,000 | 30 | 26 + 290,000 |
| 2 x 16 | 972,000 | 54 | 50 + 456,000 |
| 2 x 32 | 1,860,000 | 84 | 74 + 881,000 |
| 4 x 8 | 1,781,000 | 100 | 64 + 682,000 |
| 4 x 16 | 3,430,000 | 192 | 104 + 1,732,000 |
| 4 x 32 | 6,351,000 | 432 | 188 + 3,283,000 |
| 6 x 8 | 2,923,000 | 216 | 138 + 1,049,000 |
| 6 x 16 | 5,650,000 | 498 | 246 + 2,246,000 |
| 6 x 32 | 9,842,000 | 1248 | 528 + 5,229,000 |
| 8 x 8 | 3,987,000 | 632 | 336 + 1,722,000 |
| 8 x 16 | 7,798,000 | 1384 | 544 + 3,911,000 |
| 8 x 32 | 13,189,000 | 3648 | 1,152 + 7,891,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 815,000 | 63 | 54 + 335,000 |
| 3 x 16 | 1,688,000 | 120 | 87 + 705,000 |
| 3 x 32 | 3,164,000 | 255 | 132 + 1,626,000 |
| 6 x 8 | 2,269,000 | 252 | 168 + 838,000 |
| 6 x 16 | 4,593,000 | 624 | 336 + 2,019,000 |
| 6 x 32 | 7,776,000 | 1340 | 576 + 3,624,000 |
{{% /expand %}}
{{% expand "Replication factor, 4" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 4 x 8 | 1,028,000 | 116 | 98 + 365,000 |
| 4 x 16 | 2,067,000 | 208 | 140 + 8,056,000 |
| 4 x 32 | 3,290,000 | 428 | 228 + 1,892,000 |
| 8 x 8 | 2,813,000 | 928 | 496 + 1,225,000 |
| 8 x 16 | 5,225,000 | 2176 | 800 + 2,799,000 |
| 8 x 32 | 8,555,000 | 5184 | 1088 + 6,055,000 |
{{% /expand %}}
{{% expand "Replication factor, 6" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 6 x 8 | 1,261,000 | 288 | 192 + 522,000 |
| 6 x 16 | 2,370,000 | 576 | 288 + 1,275,000 |
| 6 x 32 | 3,601,000 | 1056 | 336 + 2,390,000 |
{{% /expand %}}
{{% expand "Replication factor, 8" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| ----------------: | -----------------: |:---------------------------:|
| 8 x 8 | 1,382,000 | 1184 | 416 + 915,000 |
| 8 x 16 | 2,658,000 | 2504 | 448 + 2,204,000 |
| 8 x 32 | 3,887,000 | 5184 | 602 + 4,120,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 100,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 1 x 4 | 143,000 | 5 | 4 + 77,000 |
| 1 x 8 | 322,000 | 9 | 8 + 167,000 |
| 1 x 16 | 624,000 | 17 | 12 + 337,000 |
| 1 x 32 | 1,114,000 | 26 | 18 + 657,000 |
| 2 x 4 | 265,000 | 14 | 12 + 115,000 |
| 2 x 8 | 573,000 | 30 | 22 + 269,000 |
| 2 x 16 | 1,261,000 | 52 | 38 + 679,000 |
| 2 x 32 | 2,335,000 | 90 | 66 + 1,510,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 196,000 | 16 | 14 + 77,000 |
| 2 x 8 | 482,000 | 30 | 24 + 203,000 |
| 2 x 16 | 1,060,000 | 60 | 42 + 415,000 |
| 2 x 32 | 1,958,000 | 94 | 64 + 984,000 |
| 4 x 8 | 1,144,000 | 108 | 68 + 406,000 |
| 4 x 16 | 2,512,000 | 228 | 148 + 866,000 |
| 4 x 32 | 4,346,000 | 564 | 320 + 1,886,000 |
| 6 x 8 | 1,802,000 | 252 | 156 + 618,000 |
| 6 x 16 | 3,924,000 | 562 | 384 + 1,068,000 |
| 6 x 32 | 6,533,000 | 1340 | 912 + 2,083,000 |
| 8 x 8 | 2,516,000 | 712 | 360 + 1,020,000 |
| 8 x 16 | 5,478,000 | 1632 | 1,024 + 1,843,000 |
| 8 x 32       | 10,527,000        | 3392               | 1,792 + 4,998,000           |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 616,000 | 72 | 51 + 218,000 |
| 3 x 16 | 1,268,000 | 117 | 84 + 438,000 |
| 3 x 32 | 2,260,000 | 189 | 114 + 984,000 |
| 6 x 8 | 1,393,000 | 294 | 192 + 421,000 |
| 6 x 16 | 3,056,000 | 726 | 456 + 893,000 |
| 6 x 32 | 5,017,000 | 1584 | 798 + 1,098,000 |
{{% /expand %}}
{{% expand "Replication factor, 4" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 4 x 8 | 635,000 | 112 | 80 + 207,000 |
| 4 x 16 | 1,359,000 | 188 | 124 + 461,000 |
| 4 x 32 | 2,320,000 | 416 | 192 + 1,102,000 |
| 8 x 8 | 1,570,000 | 1360 | 816 + 572,000 |
| 8 x 16 | 3,205,000 | 2720 | 832 + 2,053,000 |
| 8 x 32 | 3,294,000 | 2592 | 804 + 2,174,000 |
{{% /expand %}}
{{% expand "Replication factor, 6" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 6 x 8 | 694,000 | 302 | 198 + 204,000 |
| 6 x 16 | 1,397,000 | 552 | 360 + 450,000 |
| 6 x 32 | 2,298,000 | 1248 | 384 + 1,261,000 |
{{% /expand %}}
{{% expand "Replication factor, 8" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| ----------------: | -----------------: |:---------------------------:|
| 8 x 8 | 739,000 | 1296 | 480 + 371,000 |
| 8 x 16 | 1,396,000 | 2592 | 672 + 843,000 |
| 8 x 32 | 2,614,000 | 2496 | 960 + 1,371,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 1,000,000 series:
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:-------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 104,000 | 18 | 12 + 54,000 |
| 2 x 8 | 195,000 | 36 | 24 + 99,000 |
| 2 x 16 | 498,000 | 70 | 44 + 145,000 |
| 2 x 32 | 1,195,000 | 102 | 84 + 232,000 |
| 4 x 8 | 488,000 | 120 | 56 + 222,000 |
| 4 x 16 | 1,023,000 | 244 | 112 + 428,000 |
| 4 x 32 | 2,686,000 | 468 | 208 + 729,000 |
| 6 x 8 | 845,000 | 270 | 126 + 356,000 |
| 6 x 16 | 1,780,000 | 606 | 288 + 663,000 |
| 6 x 32 | 430,000 | 1,488 | 624 + 1,209,000 |
| 8 x 8 | 1,831,000 | 808 | 296 + 778,000 |
| 8 x 16 | 4,167,000 | 1,856 | 640 + 2,031,000 |
| 8 x 32 | 7,813,000 | 3,201 | 896 + 4,897,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 234,000 | 72 | 42 + 87,000 |
| 3 x 16 | 613,000 | 120 | 75 + 166,000 |
| 3 x 32 | 1,365,000 | 141 | 114 + 984,000 |
| 6 x 8 | 593,000 | 318 | 144 + 288,000 |
| 6 x 16 | 1,545,000 | 744 | 384 + 407,000 |
| 6 x 32 | 3,204,000 | 1632 | 912 + 505,000 |
{{% /expand %}}
{{% expand "Replication factor, 4" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 4 x 8 | 258,000 | 116 | 68 + 73,000 |
| 4 x 16 | 675,000 | 196 | 132 + 140,000 |
| 4 x 32 | 1,513,000 | 244 | 176 + 476,000 |
| 8 x 8 | 614,000 | 1096 | 400 + 258,000 |
| 8 x 16 | 1,557,000 | 2496 | 1152 + 436,000 |
| 8 x 32 | 3,265,000 | 4288 | 2240 + 820,000 |
{{% /expand %}}
{{% expand "Replication factor, 6" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 6 x 8 | 694,000 | 302 | 198 + 204,000 |
| 6 x 16 | 1,397,000 | 552 | 360 + 450,000 |
| 6 x 32 | 2,298,000 | 1248 | 384 + 1,261,000 |
{{% /expand %}}
{{% expand "Replication factor, 8" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| ----------------: | -----------------: |:---------------------------:|
| 8 x 8 | 739,000 | 1296 | 480 + 371,000 |
| 8 x 16 | 1,396,000 | 2592 | 672 + 843,000 |
| 8 x 32 | 2,614,000 | 2496 | 960 + 1,371,000 |
{{% /expand %}}
{{% /tab-content %}}
{{% tab-content %}}
Select one of the following replication factors to see the recommended cluster configuration for 10,000,000 series:
{{% expand "Replication factor, 1" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 122,000 | 16 | 12 + 81,000 |
| 2 x 8 | 259,000 | 36 | 24 + 143,000 |
| 2 x 16 | 501,000 | 66 | 44 + 290,000 |
| 2 x 32 | 646,000 | 142 | 54 + 400,000 |
{{% /expand %}}
{{% expand "Replication factor, 2" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 2 x 4 | 87,000 | 18 | 14 + 56,000 |
| 2 x 8 | 169,000 | 38 | 24 + 98,000 |
| 2 x 16 | 334,000 | 76 | 46 + 224,000 |
| 2 x 32 | 534,000 | 136 | 58 + 388,000 |
| 4 x 8 | 335,000 | 120 | 60 + 204,000 |
| 4 x 16 | 643,000 | 256 | 112 + 395,000 |
| 4 x 32 | 967,000 | 560 | 158 + 806,000 |
| 6 x 8 | 521,000 | 378 | 144 + 319,000 |
| 6 x 16 | 890,000 | 582 | 186 + 513,000 |
| 8 x 8 | 699,000 | 1,032 | 256 + 477,000 |
| 8 x 16 | 1,345,000 | 2,048 | 544 + 741,000 |
{{% /expand %}}
{{% expand "Replication factor, 3" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:|------------------:|-------------------:|:---------------------------:|
| 3 x 8 | 170,000 | 60 | 42 + 98,000 |
| 3 x 16 | 333,000 | 129 | 76 + 206,000 |
| 3 x 32 | 609,000 | 178 | 60 + 162,000 |
| 6 x 8 | 395,000 | 402 | 132 + 247,000 |
| 6 x 16 | 679,000 | 894 | 150 + 527,000 |
{{% /expand %}}
{{% expand "Replication factor, 4" %}}
| Nodes x Core | Writes per second | Queries per second | Queries + writes per second |
|:------------:| -----------------:| ------------------:|:---------------------------:|
| 4 x 8        | 183,365           | 132                | 52 + 100,000                |
{{% /expand %}}
{{% /tab-content %}}
{{< /tabs-wrapper >}}
## Storage: type, amount, and configuration
### Storage volume and IOPS
Consider the type of storage you need and the amount. InfluxDB is designed to run on solid state drives (SSDs) and memory-optimized cloud instances, for example, AWS EC2 R5 or R4 instances. InfluxDB isn't tested on hard disk drives (HDDs) and we don't recommend HDDs for production. For best results, InfluxDB servers must have a minimum of 1000 IOPS on storage to ensure recovery and availability. We recommend at least 2000 IOPS for rapid recovery of cluster data nodes after downtime.
See your cloud provider documentation for IOPS detail on your storage volumes.
### Bytes and compression
Database names, [measurements](/influxdb/v1.8/concepts/glossary/#measurement), [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key), [field keys](/influxdb/v1.8/concepts/glossary/#field-key), and [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) are stored only once and always as strings. [Field values](/influxdb/v1.8/concepts/glossary/#field-value) and [timestamps](/influxdb/v1.8/concepts/glossary/#timestamp) are stored for every point.
Non-string values require approximately three bytes. String values require variable space, determined by string compression.
### Separate `wal` and `data` directories
When running InfluxDB in a production environment, store the `wal` directory and the `data` directory on separate storage devices. This optimization significantly reduces disk contention under heavy write load, an important consideration if the write load is highly variable. If the write load does not vary by more than 15%, the optimization is probably not necessary.
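For example, assuming the two devices are mounted at `/mnt/influx` and `/mnt/db` (the same mount points used in the AWS guidance in the installation documentation), the relevant configuration settings would look like this:

```
[data]
  dir = "/mnt/db/data"
  wal-dir = "/mnt/influx/wal"
```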

View File

@ -0,0 +1,11 @@
---
title: Migrate from InfluxDB OSS to InfluxDB Enterprise
description: >
Migrate your InfluxDB OSS instance with your data and users to InfluxDB Enterprise.
menu:
influxdb_1_8:
weight: 50
parent: Guides
name: Migrate to InfluxDB Enterprise
url: /enterprise_influxdb/v1.8/guides/migration/
---

View File

@ -0,0 +1,109 @@
---
title: Query data with the InfluxDB API
menu:
influxdb_1_8:
weight: 20
parent: Guides
aliases:
- /docs/v1.8/query_language/querying_data/
- /influxdb/v1.8/guides/querying_data/
---
The InfluxDB API is the primary means for querying data in InfluxDB (see the [command line interface](/influxdb/v1.8/tools/shell/) and [client libraries](/influxdb/v1.8/tools/api_client_libraries/) for alternative ways to query the database).
Query data with the InfluxDB API using [Flux](#query-data-with-flux) or [InfluxQL](#query-data-with-influxql).
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
## Query data with Flux
For Flux queries, the `/api/v2/query` endpoint accepts `POST` HTTP requests. Use the following HTTP headers:
- `Accept: application/csv`
- `Content-type: application/vnd.flux`
If you have authentication enabled, provide your InfluxDB username and password with the `Authorization` header and `Token` scheme. For example: `Authorization: Token username:password`.
The following example queries Telegraf data using Flux:
```bash
$ curl -XPOST localhost:8086/api/v2/query -sS \
-H 'Accept:application/csv' \
-H 'Content-type:application/vnd.flux' \
-d 'from(bucket:"telegraf")
|> range(start:-5m)
|> filter(fn:(r) => r._measurement == "cpu")'
```
Flux returns [annotated CSV](https://v2.docs.influxdata.com/v2.0/reference/syntax/annotated-csv/):
```
,result,table,_start,_stop,_time,_value,_field,_measurement,cpu,host
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:19Z,4.152553004641827,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:29Z,7.608695652173913,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:39Z,2.9363988504310883,usage_user,cpu,cpu-total,host1
,_result,0,2020-04-07T18:02:54.924273Z,2020-04-07T19:02:54.924273Z,2020-04-07T18:08:49Z,6.915093159934975,usage_user,cpu,cpu-total,host1
```
The header row defines column labels for the table. The `cpu` [measurement](/influxdb/v1.8/concepts/glossary/#measurement) has four points, each represented by one of the record rows. For example, the first point has a [timestamp](/influxdb/v1.8/concepts/glossary/#timestamp) of `2020-04-07T18:08:19Z`.
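If authentication is enabled, the same request also needs the `Authorization` header described above. A minimal sketch, with `myuser:mypassword` as placeholder credentials:

```bash
curl -XPOST localhost:8086/api/v2/query -sS \
  -H 'Authorization: Token myuser:mypassword' \
  -H 'Accept:application/csv' \
  -H 'Content-type:application/vnd.flux' \
  -d 'from(bucket:"telegraf") |> range(start:-5m)'
```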
### Flux
Check out [Get started with Flux](https://v2.docs.influxdata.com/v2.0/query-data/get-started/) to learn more about building queries with Flux.
For more information about querying data with the InfluxDB API using Flux, see the [API reference documentation](/influxdb/v1.8/tools/api/#influxdb-2-0-api-compatibility-endpoints).
## Query data with InfluxQL
To perform an InfluxQL query, send a `GET` request to the `/query` endpoint, set the URL parameter `db` as the target database, and set the URL parameter `q` as your query.
You can also use a `POST` request by sending the same parameters either as URL parameters or as part of the body with `application/x-www-form-urlencoded`.
The example below uses the InfluxDB API to query the same database that you encountered in [Writing Data](/influxdb/v1.8/guides/writing_data/).
```bash
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```
InfluxDB returns JSON:
```json
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "cpu_load_short",
"columns": [
"time",
"value"
],
"values": [
[
"2015-01-29T21:55:43.702900257Z",
2
],
[
"2015-01-29T21:55:43.702900257Z",
0.55
],
[
"2015-06-11T20:46:02Z",
0.64
]
]
}
]
}
]
}
```
> **Note:** Appending `pretty=true` to the URL enables pretty-printed JSON output.
While this is useful for debugging or when querying directly with tools like `curl`, it is not recommended for production use as it consumes unnecessary network bandwidth.
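A sketch of the equivalent `POST` request, sending the same parameters as a form-urlencoded body:

```bash
curl -XPOST 'http://localhost:8086/query' \
  --data-urlencode "db=mydb" \
  --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
```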
### InfluxQL
Check out the [Data Exploration page](/influxdb/v1.8/query_language/data_exploration/) to get acquainted with InfluxQL.
For more information about querying data with the InfluxDB API using InfluxQL, see the [API reference documentation](/influxdb/v1.8/tools/api/#influxdb-1-x-http-endpoints).

View File

@ -0,0 +1,188 @@
---
title: Write data with the InfluxDB API
menu:
influxdb_1_8:
weight: 10
parent: Guides
aliases:
- /influxdb/v1.8/guides/writing_data/
---
Write data into InfluxDB using the [command line interface](/influxdb/v1.8/tools/shell/), [client libraries](/influxdb/v1.8/clients/api/), and plugins for common data formats such as [Graphite](/influxdb/v1.8/write_protocols/graphite/).
> **Note**: The following examples use `curl`, a command line tool that transfers data using URLs. Learn the basics of `curl` with the [HTTP Scripting Guide](https://curl.haxx.se/docs/httpscripting.html).
### Create a database using the InfluxDB API
To create a database, send a `POST` request to the `/query` endpoint and set the URL parameter `q` to `CREATE DATABASE <new_database_name>`.
The example below sends a request to InfluxDB running on `localhost` and creates the `mydb` database:
```bash
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
```
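To confirm the database exists, you can run `SHOW DATABASES` through the same endpoint (`pretty=true` is added here only for readability):

```bash
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "q=SHOW DATABASES"
```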
### Write data using the InfluxDB API
The InfluxDB API is the primary means of writing data into InfluxDB.
- To **write to a database using the InfluxDB 1.8 API**, send `POST` requests to the `/write` endpoint. For example, the following request writes a single point to the `mydb` database.
The data consists of the [measurement](/influxdb/v1.8/concepts/glossary/#measurement) `cpu_load_short`, the [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key) `host` and `region` with the [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) `server01` and `us-west`, the [field key](/influxdb/v1.8/concepts/glossary/#field-key) `value` with a [field value](/influxdb/v1.8/concepts/glossary/#field-value) of `0.64`, and the [timestamp](/influxdb/v1.8/concepts/glossary/#timestamp) `1434055562000000000`.
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
```
- To **write to a database using the InfluxDB 2.0 API (compatible with InfluxDB 1.8+)**, send `POST` requests to the [`/api/v2/write` endpoint](/influxdb/v1.8/tools/api/#client-libraries):
```bash
curl -i -XPOST 'http://localhost:8086/api/v2/write?bucket=db/rp&precision=ns' \
--header 'Authorization: Token username:password' \
--data-raw 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
```
When writing points, you must specify an existing database in the `db` query parameter.
Points will be written to `db`'s default retention policy if you do not supply a retention policy via the `rp` query parameter.
See the [InfluxDB API Reference](/influxdb/v1.8/tools/api/#write-http-endpoint) documentation for a complete list of the available query parameters.
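For example, a sketch of a write that targets a specific retention policy via the `rp` parameter (assuming an RP named `one_week` already exists on `mydb`):

```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb&rp=one_week' \
  --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
```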
The body of the POST, known as the [InfluxDB line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol), contains the time series data that you want to store. Data includes:
- **Measurement (required)**
- **Tags**: Strictly speaking, tags are optional but most series include tags to differentiate data sources and to make querying both easy and efficient.
Both tag keys and tag values are strings.
- **Fields (required)**: Field keys are required and are always strings, and, [by default](/influxdb/v1.8/write_protocols/line_protocol_reference/#data-types), field values are floats.
- **Timestamp**: Optional. Supplied at the end of the line as Unix time in nanoseconds since January 1, 1970 UTC. If you do not specify a timestamp, InfluxDB uses the server's local nanosecond timestamp in Unix epoch. Time in InfluxDB is in UTC format by default.
> **Note:** Avoid using the following reserved keys: `_field`, `_measurement`, and `time`. If reserved keys are included as a tag or field key, the associated point is discarded.
### Configure gzip compression
InfluxDB supports gzip compression. To reduce network traffic, consider the following options:
* To accept compressed data from InfluxDB, add the `Accept-Encoding: gzip` header to InfluxDB API requests.
* To compress data before sending it to InfluxDB, add the `Content-Encoding: gzip` header to InfluxDB API requests.
For details about enabling gzip for client libraries, see your client library documentation.
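For example, a minimal sketch of a compressed write, assuming a line protocol file like the `cpu_data.txt` example shown later in this guide:

```bash
# Compress the line protocol payload, then tell InfluxDB it is gzip-encoded
gzip -c cpu_data.txt > cpu_data.txt.gz
curl -i -XPOST 'http://localhost:8086/write?db=mydb' \
  --header 'Content-Encoding: gzip' \
  --data-binary @cpu_data.txt.gz
```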
#### Enable gzip compression in the Telegraf InfluxDB output plugin
* In the Telegraf configuration file (`telegraf.conf`), under `[[outputs.influxdb]]`, change
`content_encoding = "identity"` (default) to `content_encoding = "gzip"`.
>**Note:** Writes to the InfluxDB 2.x output plugin (`[[outputs.influxdb_v2]]`) are configured to compress content in gzip format by default.
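A sketch of the relevant section of `telegraf.conf` (the `urls` and `database` values are placeholders for your own settings):

```
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "mydb"
  content_encoding = "gzip"
```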
### Writing multiple points
Post multiple points to multiple series at the same time by separating each point with a new line.
Batching points in this manner results in much higher performance.
The following example writes three points to the database `mydb`.
The first point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02` and has the server's local timestamp.
The second point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02,region=us-west` and has the specified timestamp `1422568543702900257`.
The third point has the same specified timestamp as the second point, but it is written to the series with the measurement `cpu_load_short` and tag set `direction=in,host=server01,region=us-west`.
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257'
```
### Writing points from a file
Write points from a file by passing `@filename` to `curl`.
The data in the file should follow the [InfluxDB line protocol syntax](/influxdb/v1.8/write_protocols/write_syntax/).
Example of a properly-formatted file (`cpu_data.txt`):
```txt
cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257
```
Write the data in `cpu_data.txt` to the `mydb` database with:
```bash
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt
```
> **Note:** If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
By default, the HTTP request times out after five seconds.
InfluxDB will still attempt to write the points after that timeout, but there will be no confirmation that they were successfully written.
### Schemaless Design
InfluxDB is a schemaless database.
You can add new measurements, tags, and fields at any time.
Note that if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB rejects that data.
### A note on REST
InfluxDB uses HTTP solely as a convenient and widely supported data transfer protocol.
Modern web APIs have settled on REST because it addresses a common need.
As the number of endpoints grows the need for an organizing system becomes pressing.
REST is the industry-agreed style for organizing large numbers of endpoints.
This consistency is good for those developing and consuming the API: everyone involved knows what to expect.
REST, however, is a convention.
InfluxDB makes do with three API endpoints.
This simple, easy to understand system uses HTTP as a transfer method for [InfluxQL](/influxdb/v1.8/query_language/spec/).
The InfluxDB API makes no attempt to be RESTful.
### HTTP response summary
* 2xx: If your write request received `HTTP 204 No Content`, it was a success!
* 4xx: InfluxDB could not understand the request.
* 5xx: The system is overloaded or significantly impaired.
#### Examples
##### Writing a float to a field that previously accepted booleans
```bash
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=true'
curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=5'
```
returns:
```bash
HTTP/1.1 400 Bad Request
Content-Type: application/json
Request-Id: [...]
X-Influxdb-Version: 1.4.x
Date: Wed, 01 Mar 2017 19:38:01 GMT
Content-Length: 150
{"error":"field type conflict: input field \"booleanonly\" on measurement \"tobeornottobe\" is type float, already exists as type boolean dropped=1"}
```
##### Writing a point to a database that doesn't exist
```bash
curl -i -XPOST 'http://localhost:8086/write?db=atlantis' --data-binary 'liters value=10'
```
returns:
```bash
HTTP/1.1 404 Not Found
Content-Type: application/json
Request-Id: [...]
X-Influxdb-Version: 1.4.x
Date: Wed, 01 Mar 2017 19:38:35 GMT
Content-Length: 45
{"error":"database not found: \"atlantis\""}
```
### Next steps
Now that you know how to write data with the InfluxDB API, discover how to query it with the [Querying data](/influxdb/v1.8/guides/querying_data/) guide!
For more information about writing data with the InfluxDB API, please see the [InfluxDB API reference](/influxdb/v1.8/tools/api/#write-http-endpoint).

View File

@ -0,0 +1,15 @@
---
title: High availability with InfluxDB Enterprise
menu:
influxdb_1_8:
name: High availability
weight: 100
---
## [Clustering with InfluxDB Enterprise](/influxdb/v1.8/high_availability/clusters/)
InfluxDB OSS does not support clustering.
For high availability or horizontal scaling of InfluxDB, consider the InfluxData
commercial clustered offering,
[InfluxDB Enterprise](https://portal.influxdata.com/).

View File

@ -0,0 +1,18 @@
---
title: Create an InfluxDB Enterprise cluster
aliases:
- /influxdb/v1.8/clustering/
- /influxdb/v1.8/clustering/cluster_setup/
- /influxdb/v1.8/clustering/cluster_node_config/
- /influxdb/v1.8/guides/clustering/
menu:
influxdb_1_8:
name: Create a cluster
weight: 10
parent: High availability
---
InfluxDB OSS does not support clustering.
For high availability or horizontal scaling of InfluxDB, consider the InfluxData
commercial clustered offering,
[InfluxDB Enterprise](/enterprise_influxdb/latest/).

View File

@ -0,0 +1,21 @@
---
title: Learn about InfluxDB OSS
menu:
influxdb_1_8:
name: Introduction
weight: 20
---
To get up and running with the open source (OSS) version of InfluxDB, complete the following tasks:
## [Download InfluxDB OSS](https://portal.influxdata.com/downloads)
Find the latest stable download and nightly builds of InfluxDB.
## [Install InfluxDB OSS](/influxdb/v1.8/introduction/installation/)
Learn how to install InfluxDB on Ubuntu, Debian, Red Hat, CentOS, and macOS.
## [Get started with InfluxDB OSS](/influxdb/v1.8/introduction/getting-started/)
Discover how to read and write time series data using InfluxDB.

View File

@ -0,0 +1,12 @@
---
title: Download InfluxDB OSS
menu:
influxdb_1_8:
name: Download InfluxDB
weight: 10
parent: Introduction
aliases:
- /influxdb/v1.8/introduction/downloading/
---
Download the latest InfluxDB open source (OSS) release at the [InfluxData download page](https://portal.influxdata.com/downloads).

View File

@ -0,0 +1,198 @@
---
title: Get started with InfluxDB OSS
aliases:
- /influxdb/v1.8/introduction/getting_started/
- /influxdb/v1.8/introduction/getting-started/
menu:
influxdb_1_8:
name: Get started with InfluxDB
weight: 30
parent: Introduction
---
With InfluxDB open source (OSS) [installed](/influxdb/v1.8/introduction/installation), you're ready to start doing some awesome things.
In this section we'll use the `influx` [command line interface](/influxdb/v1.8/tools/shell/) (CLI), which is included in all
InfluxDB packages and is a lightweight and simple way to interact with the database.
The CLI communicates with InfluxDB directly by making requests to the InfluxDB API over port `8086` by default.
> **Note:** The database can also be used by making raw HTTP requests.
See [Writing Data](/influxdb/v1.8/guides/writing_data/) and [Querying Data](/influxdb/v1.8/guides/querying_data/)
for examples with the `curl` application.
## Creating a database
If you've installed InfluxDB locally, the `influx` command should be available via the command line.
Executing `influx` will start the CLI and automatically connect to the local InfluxDB instance
(assuming you have already started the server with `service influxdb start` or by running `influxd` directly).
The output should look like this:
```bash
$ influx -precision rfc3339
Connected to http://localhost:8086 version 1.8.x
InfluxDB shell 1.8.x
>
```
> **Notes:**
>
* The InfluxDB API runs on port `8086` by default.
Therefore, `influx` will connect to port `8086` and `localhost` by default.
If you need to alter these defaults, run `influx --help`.
* The [`-precision` argument](/influxdb/v1.8/tools/shell/#influx-options) specifies the format/precision of any returned timestamps.
In the example above, `rfc3339` tells InfluxDB to return timestamps in [RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt) (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`).
The command line is now ready to take input in the form of Influx Query Language (a.k.a. InfluxQL) statements.
To exit the InfluxQL shell, type `exit` and hit return.
A fresh install of InfluxDB has no databases (apart from the system `_internal`),
so creating one is our first task.
You can create a database with the `CREATE DATABASE <db-name>` InfluxQL statement,
where `<db-name>` is the name of the database you wish to create.
Names of databases can contain any Unicode character as long as the string is double-quoted.
Names can also be left unquoted if they contain _only_ ASCII letters,
digits, or underscores and do not begin with a digit.
Throughout this guide, we'll use the database name `mydb`:
```sql
> CREATE DATABASE mydb
>
```
> **Note:** After hitting enter, a new prompt appears and nothing else is displayed.
In the CLI, this means the statement was executed and there were no errors to display.
There will always be an error displayed if something went wrong.
No news is good news!
Now that the `mydb` database is created, we'll use the `SHOW DATABASES` statement
to display all existing databases:
```sql
> SHOW DATABASES
name: databases
name
----
_internal
mydb
>
```
> **Note:** The `_internal` database is created and used by InfluxDB to store internal runtime metrics.
Check it out later to get an interesting look at how InfluxDB is performing under the hood.
Unlike `SHOW DATABASES`, most InfluxQL statements must operate against a specific database.
You may explicitly name the database with each query,
but the CLI provides a convenience statement, `USE <db-name>`,
which will automatically set the database for all future requests. For example:
```sql
> USE mydb
Using database mydb
>
```
Now future commands will only be run against the `mydb` database.
## Writing and exploring data
Now that we have a database, InfluxDB is ready to accept queries and writes.
First, a short primer on the datastore.
Data in InfluxDB is organized by "time series",
which contain a measured value, like "cpu_load" or "temperature".
Time series have zero to many `points`, one for each discrete sample of the metric.
Points consist of `time` (a timestamp), a `measurement` ("cpu_load", for example),
at least one key-value `field` (the measured value itself, e.g.
"value=0.64", or "temperature=21.2"), and zero to many key-value `tags` containing any metadata about the value (e.g.
"host=server01", "region=EMEA", "dc=Frankfurt").
Conceptually you can think of a `measurement` as an SQL table,
where the primary index is always time.
`tags` and `fields` are effectively columns in the table.
`tags` are indexed, and `fields` are not.
The difference is that, with InfluxDB, you can have millions of measurements,
you don't have to define schemas up-front, and null values aren't stored.
Points are written to InfluxDB using the InfluxDB line protocol, which follows the following format:
```
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
```
The following lines are all examples of points that can be written to InfluxDB:
```
cpu,host=serverA,region=us_west value=0.64
payment,device=mobile,product=Notepad,method=credit billed=33,licenses=3i 1434067467100293230
stock,symbol=AAPL bid=127.46,ask=127.48
temperature,machine=unit42,type=assembly external=25,internal=37 1434067467000000000
```
> **Note:** For details on the InfluxDB line protocol, see the [InfluxDB line protocol syntax](/influxdb/v1.8/write_protocols/line_protocol_reference/#line-protocol-syntax) page.
To insert a single time series data point into InfluxDB using the CLI, enter `INSERT` followed by a point:
```sql
> INSERT cpu,host=serverA,region=us_west value=0.64
>
```
A point with the measurement name of `cpu` and tags `host` and `region` has now been written to the database, with the measured `value` of `0.64`.
Now we will query for the data we just wrote:
```sql
> SELECT "host", "region", "value" FROM "cpu"
name: cpu
---------
time host region value
2015-10-21T19:28:07.580664347Z serverA us_west 0.64
>
```
> **Note:** We did not supply a timestamp when writing our point.
When no timestamp is supplied for a point, InfluxDB assigns the local current timestamp when the point is ingested.
That means your timestamp will be different.
Let's try storing another type of data, with two fields in the same measurement:
```sql
> INSERT temperature,machine=unit42,type=assembly external=25,internal=37
>
```
To return all fields and tags with a query, you can use the `*` operator:
```sql
> SELECT * FROM "temperature"
name: temperature
-----------------
time external internal machine type
2015-10-21T19:28:08.385013942Z 25 37 unit42 assembly
>
```
> **Warning:** Using `*` without a `LIMIT` clause on a large database can cause performance issues.
You can use `Ctrl+C` to cancel a query that is taking too long to respond.
InfluxQL has many [features and keywords](/influxdb/v1.8/query_language/spec/) that are not covered here,
including support for Go-style regex. For example:
```sql
> SELECT * FROM /.*/ LIMIT 1
--
> SELECT * FROM "cpu_load_short"
--
> SELECT * FROM "cpu_load_short" WHERE "value" > 0.9
```
This is all you need to know to write data into InfluxDB and query it back.
To learn more about the InfluxDB write protocol,
check out the guide on [Writing Data](/influxdb/v1.8/guides/writing_data/).
To further explore the query language,
check out the guide on [Querying Data](/influxdb/v1.8/guides/querying_data/).
For more information on InfluxDB concepts, check out the [Key Concepts](/influxdb/v1.8/concepts/key_concepts/) page.

View File

@ -0,0 +1,384 @@
---
title: Install InfluxDB OSS
menu:
influxdb_1_8:
name: Install InfluxDB
weight: 20
parent: Introduction
aliases:
- /influxdb/v1.8/introduction/installation/
---
This page provides directions for installing, starting, and configuring InfluxDB open source (OSS).
## InfluxDB OSS installation requirements
Installation of the InfluxDB package may require `root` or administrator privileges in order to complete successfully.
### InfluxDB OSS networking ports
By default, InfluxDB uses the following network ports:
- TCP port `8086` is available for client-server communication using the InfluxDB API.
- TCP port `8088` is available for the RPC service to perform back up and restore operations.
In addition to the ports above, InfluxDB also offers multiple plugins that may
require [custom ports](/influxdb/v1.8/administration/ports/).
All port mappings can be modified through the [configuration file](/influxdb/v1.8/administration/config),
which is located at `/etc/influxdb/influxdb.conf` for default installations.
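For example, the HTTP API port is set by `bind-address` in the `[http]` section; a sketch:

```
[http]
  # Client-server communication port for the InfluxDB API
  bind-address = ":8086"
```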
### Network Time Protocol (NTP)
InfluxDB uses a host's local time in UTC to assign timestamps to data and for
coordination purposes.
Use the Network Time Protocol (NTP) to synchronize time between hosts; if hosts'
clocks aren't synchronized with NTP, the timestamps on the data written to InfluxDB
can be inaccurate.
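On systemd-based Linux hosts, one way to check whether the clock is synchronized is `timedatectl` (shown as an illustration; your distribution may use `ntpd` or `chrony` instead):

```bash
timedatectl status | grep -i 'synchronized'
```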
## Installing InfluxDB OSS
If you don't want to install any software and are ready to use InfluxDB, you may want to check out our
[managed hosted InfluxDB offering](https://cloud.influxdata.com).
{{< tabs-wrapper >}}
{{% tabs %}}
[Ubuntu & Debian](#)
[Red Hat & CentOS](#)
[SLES & openSUSE](#)
[FreeBSD/PC-BSD](#)
[macOS](#)
{{% /tabs %}}
{{% tab-content %}}
For instructions on how to install the Debian package from a file,
please see the
[downloads page](https://influxdata.com/downloads/).
Debian and Ubuntu users can install the latest stable version of InfluxDB using the
`apt-get` package manager.
For Ubuntu users, add the InfluxData repository with the following commands:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[wget](#)
[curl](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
For Debian users, add the InfluxData repository:
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[wget](#)
[curl](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
```bash
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/os-release
echo "deb https://repos.influxdata.com/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
```
{{% /code-tab-content %}}
{{% code-tab-content %}}
```bash
curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/os-release
echo "deb https://repos.influxdata.com/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
Then, install and start the InfluxDB service:
```bash
sudo apt-get update && sudo apt-get install influxdb
sudo service influxdb start
```
Or if your operating system is using systemd (Ubuntu 15.04+, Debian 8+):
```bash
sudo apt-get update && sudo apt-get install influxdb
sudo systemctl unmask influxdb.service
sudo systemctl start influxdb
```
{{% /tab-content %}}
{{% tab-content %}}
For instructions on how to install the RPM package from a file, please see the [downloads page](https://influxdata.com/downloads/).
Red Hat and CentOS users can install the latest stable version of InfluxDB using the `yum` package manager:
```bash
cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF
```
Once the repository is added to the `yum` configuration, install and start the InfluxDB service by running:
```bash
sudo yum install influxdb
sudo service influxdb start
```
Or if your operating system is using systemd (CentOS 7+, RHEL 7+):
```bash
sudo yum install influxdb
sudo systemctl start influxdb
```
{{% /tab-content %}}
{{% tab-content %}}
There are RPM packages provided by openSUSE Build Service for SUSE Linux users:
```bash
# add go repository
zypper ar -f obs://devel:languages:go/ go
# install latest influxdb
zypper in influxdb
```
{{% /tab-content %}}
{{% tab-content %}}
InfluxDB is part of the FreeBSD package system.
It can be installed by running:
```bash
sudo pkg install influxdb
```
The configuration file is located at `/usr/local/etc/influxd.conf` with examples in `/usr/local/etc/influxd.conf.sample`.
Start the backend by executing:
```bash
sudo service influxd onestart
```
To have InfluxDB start at system boot, add `influxd_enable="YES"` to `/etc/rc.conf`.
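For example, using `sysrc` (a FreeBSD utility for editing `rc.conf`):

```bash
sudo sysrc influxd_enable="YES"
```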
{{% /tab-content %}}
{{% tab-content %}}
Users of macOS 10.8 and higher can install InfluxDB using the [Homebrew](http://brew.sh/) package manager.
Once `brew` is installed, you can install InfluxDB by running:
```bash
brew update
brew install influxdb
```
To have `launchd` start InfluxDB at login, run:
```bash
ln -sfv /usr/local/opt/influxdb/*.plist ~/Library/LaunchAgents
```
And then to start InfluxDB now, run:
```bash
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.influxdb.plist
```
Or, if you don't want/need launchctl, in a separate terminal window you can just run:
```bash
influxd -config /usr/local/etc/influxdb.conf
```
{{% /tab-content %}}
{{< /tabs-wrapper >}}
### Verify the authenticity of downloaded binary (optional)
For added security, follow these steps to verify the signature of your InfluxDB download with `gpg`.
(Most operating systems include the `gpg` command by default.
If `gpg` is not available, see the [GnuPG homepage](https://gnupg.org/download/) for installation instructions.)
1. Download and import InfluxData's public key:
```
curl -sL https://repos.influxdata.com/influxdb.key | gpg --import
```
2. Download the signature file for the release by adding `.asc` to the download URL.
For example:
```
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.8.0_linux_amd64.tar.gz.asc
```
3. Verify the signature with `gpg --verify`:
```
gpg --verify influxdb-1.8.0_linux_amd64.tar.gz.asc influxdb-1.8.0_linux_amd64.tar.gz
```
The output from this command should include the following:
```
gpg: Good signature from "InfluxDB Packaging Service <support@influxdb.com>" [unknown]
```
## Configuring InfluxDB OSS
The system has internal defaults for every configuration file setting.
View the default configuration settings with the `influxd config` command.
> **Note:** If InfluxDB is being deployed on a publicly accessible endpoint, we strongly recommend authentication be enabled.
Otherwise the data will be publicly available to any unauthenticated user. The default settings do **NOT** enable
authentication and authorization. Further, authentication and authorization should not be solely relied upon to prevent access
and protect data from malicious actors. If additional security or compliance features are desired, InfluxDB should be run
behind a third-party service. Review the [authentication and authorization](/influxdb/v1.8/administration/authentication_and_authorization/)
settings.
Most of the settings in the local configuration file
(`/etc/influxdb/influxdb.conf`) are commented out; all
commented-out settings will be determined by the internal defaults.
Any uncommented settings in the local configuration file override the
internal defaults.
Note that the local configuration file does not need to include every
configuration setting.
There are two ways to launch InfluxDB with your configuration file:
* Point the process to the correct configuration file by using the `-config`
option:
```bash
influxd -config /etc/influxdb/influxdb.conf
```
* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your
configuration file and start the process.
For example:
```bash
# Set the configuration path, then start the process
export INFLUXDB_CONFIG_PATH=/etc/influxdb/influxdb.conf
influxd
```
InfluxDB first checks for the `-config` option and then for the environment
variable.
See the [Configuration](/influxdb/v1.8/administration/config/) documentation for more information.
### Data and WAL directory permissions
Make sure the directories in which data and the [write ahead log](/influxdb/v1.8/concepts/glossary#wal-write-ahead-log) (WAL) are stored are writable for the user running the `influxd` service.
> **Note:** If the data and WAL directories are not writable, the `influxd` service will not start.
Information about `data` and `wal` directory paths is available in the [Data settings](/influxdb/v1.8/administration/config/#data-settings) section of the [Configuring InfluxDB](/influxdb/v1.8/administration/config/) documentation.
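For example, for a packaged Linux install that uses the default paths (a sketch; the `influxdb` user and the `/var/lib/influxdb` location are created by the official packages):

```bash
sudo chown -R influxdb:influxdb /var/lib/influxdb
```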
## Hosting InfluxDB OSS on AWS
### Hardware requirements for InfluxDB
We recommend using two SSD volumes, using one for the `influxdb/wal` and the other for the `influxdb/data`.
Depending on your load, each volume should have around 1k-3k provisioned IOPS.
The `influxdb/data` volume should have more disk space with lower IOPS and the `influxdb/wal` volume should have less disk space with higher IOPS.
Each machine should have a minimum of 8GB RAM.
We've seen the best performance with the R4 class of machines, as they provide more memory than either the C3/C4 or M4 classes.
### Configuring InfluxDB OSS instances
This example assumes that you are using two SSD volumes and that you have mounted them appropriately.
This example also assumes that each of those volumes is mounted at `/mnt/influx` and `/mnt/db`.
For more information on how to do that see the Amazon documentation on how to [Add a Volume to Your Instance](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-attaching-volume.html).
### Configuration file
You'll have to update the configuration file appropriately for each InfluxDB instance you have.
```
...
[meta]
dir = "/mnt/db/meta"
...
...
[data]
dir = "/mnt/db/data"
...
wal-dir = "/mnt/influx/wal"
...
...
[hinted-handoff]
...
dir = "/mnt/db/hh"
...
```
### Authentication and Authorization
For all AWS deployments, we strongly recommend authentication be enabled. Without this, it is possible that your InfluxDB
instance may be publicly available to any unauthenticated user. The default settings do **NOT** enable
authentication and authorization. Further, authentication and authorization should not be solely relied upon to prevent access
and protect data from malicious actors. If additional security or compliance features are desired, InfluxDB should be run
behind additional services offered by AWS.
Review the [authentication and authorization](/influxdb/v1.8/administration/authentication_and_authorization/) settings.
### InfluxDB OSS permissions
When using non-standard directories for InfluxDB data and configurations, also be sure to set filesystem permissions correctly:
```bash
chown influxdb:influxdb /mnt/influx
chown influxdb:influxdb /mnt/db
```
For InfluxDB 1.7.6 or later, you must give owner permissions to the `init.sh` file. To do this, run the following script in your `influxdb` directory:
```sh
# $STDOUT, $STDERR, $USER, and $GROUP are set earlier in the init script.
# Create the log output directories if they don't exist and give the
# service user ownership of them.
if [ ! -f "$STDOUT" ]; then
  mkdir -p $(dirname $STDOUT)
  chown $USER:$GROUP $(dirname $STDOUT)
fi

if [ ! -f "$STDERR" ]; then
  mkdir -p $(dirname $STDERR)
  chown $USER:$GROUP $(dirname $STDERR)
fi

# Override init script variables with DEFAULT values
```

View File

@ -0,0 +1,73 @@
---
title: Influx Query Language (InfluxQL)
menu:
influxdb_1_8:
weight: 70
identifier: InfluxQL
---
This section introduces InfluxQL, the InfluxDB SQL-like query language for
working with data in InfluxDB databases.
## InfluxQL tutorial
The first seven documents in this section provide a tutorial-style introduction
to InfluxQL.
Feel free to download the dataset provided in
[Sample Data](/influxdb/v1.8/query_language/data_download/) and follow along
with the documentation.
#### [Data exploration](/influxdb/v1.8/query_language/data_exploration/)
Covers the query language basics for InfluxQL, including the
[`SELECT` statement](/influxdb/v1.8/query_language/data_exploration/#the-basic-select-statement),
[`GROUP BY` clauses](/influxdb/v1.8/query_language/data_exploration/#the-group-by-clause),
[`INTO` clauses](/influxdb/v1.8/query_language/data_exploration/#the-into-clause), and more.
See Data Exploration to learn about
[time syntax](/influxdb/v1.8/query_language/data_exploration/#time-syntax) and
[regular expressions](/influxdb/v1.8/query_language/data_exploration/#regular-expressions) in
queries.
#### [Schema exploration](/influxdb/v1.8/query_language/schema_exploration/)
Covers queries that are useful for viewing and exploring your
[schema](/influxdb/v1.8/concepts/glossary/#schema).
See Schema Exploration for syntax explanations and examples of InfluxQL's `SHOW`
queries.
#### [Database management](/influxdb/v1.8/query_language/database_management/)
Covers InfluxQL for managing
[databases](/influxdb/v1.8/concepts/glossary/#database) and
[retention policies](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) in
InfluxDB.
See Database Management for creating and dropping databases and retention
policies as well as deleting and dropping data.
#### [InfluxQL functions](/influxdb/v1.8/query_language/functions/)
Covers all [InfluxQL functions](/influxdb/v1.8/query_language/functions/).
#### [InfluxQL continuous queries](/influxdb/v1.8/query_language/continuous_queries/)
Covers the
[basic syntax](/influxdb/v1.8/query_language/continuous_queries/#basic-syntax),
[advanced syntax](/influxdb/v1.8/query_language/continuous_queries/#advanced-syntax),
and [common use cases](/influxdb/v1.8/query_language/continuous_queries/#continuous-query-use-cases)
for [continuous queries](/influxdb/v1.8/concepts/glossary/#continuous-query-cq).
This page also describes how to
[`SHOW`](/influxdb/v1.8/query_language/continuous_queries/#listing-continuous-queries) and
[`DROP`](/influxdb/v1.8/query_language/continuous_queries/#deleting-continuous-queries)
continuous queries.
#### [InfluxQL mathematical operators](/influxdb/v1.8/query_language/math_operators/)
Covers the use of mathematical operators in InfluxQL.
## [InfluxQL reference](/influxdb/v1.8/query_language/spec/)
The reference documentation for InfluxQL.

View File

@ -0,0 +1,985 @@
---
title: InfluxQL Continuous Queries
menu:
influxdb_1_8:
name: Continuous Queries
weight: 50
parent: InfluxQL
---
## Introduction
Continuous queries (CQs) are InfluxQL queries that run automatically and
periodically on real-time data and store query results in a
specified measurement.
<table style="width:100%">
<tr>
<td><a href="#basic-syntax">Basic Syntax</a></td>
<td><a href="#advanced-syntax">Advanced Syntax</a></td>
<td><a href="#continuous-query-management">CQ Management</a></td>
</tr>
<tr>
<td><a href="#examples-of-basic-syntax">Examples of Basic Syntax</a></td>
<td><a href="#examples-of-advanced-syntax">Examples of Advanced Syntax</a></td>
<td><a href="#continuous-query-use-cases">CQ Use Cases</a></td>
</tr>
<tr>
<td><a href="#common-issues-with-basic-syntax">Common Issues with Basic Syntax</a></td>
<td><a href="#common-issues-with-advanced-syntax">Common Issues with Advanced Syntax</a></td>
<td><a href="#further-information">Further information</a></td>
</tr>
</table>
## Syntax
### Basic syntax
```sql
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
BEGIN
<cq_query>
END
```
#### Description of basic syntax
##### The `cq_query`
The `cq_query` requires a
[function](/influxdb/v1.8/concepts/glossary/#function),
an [`INTO` clause](/influxdb/v1.8/query_language/spec/#clauses),
and a [`GROUP BY time()` clause](/influxdb/v1.8/query_language/spec/#clauses):
```sql
SELECT <function[s]> INTO <destination_measurement> FROM <measurement> [WHERE <stuff>] GROUP BY time(<interval>)[,<tag_key[s]>]
```
>**Note:** Notice that the `cq_query` does not require a time range in a `WHERE` clause.
InfluxDB automatically generates a time range for the `cq_query` when it executes the CQ.
Any user-specified time ranges in the `cq_query`'s `WHERE` clause will be ignored
by the system.
##### Schedule and coverage
Continuous queries operate on real-time data.
They use the local server's timestamp, the `GROUP BY time()` interval, and
InfluxDB database's preset time boundaries to determine when to execute and what time
range to cover in the query.
CQs execute at the same interval as the `cq_query`'s `GROUP BY time()` interval,
and they run at the start of the InfluxDB database's preset time boundaries.
If the `GROUP BY time()` interval is one hour, the CQ executes at the start of
every hour.
When the CQ executes, it runs a single query for the time range between
[`now()`](/influxdb/v1.8/concepts/glossary/#now) and `now()` minus the
`GROUP BY time()` interval.
If the `GROUP BY time()` interval is one hour and the current time is 17:00,
the query's time range is between 16:00 and 16:59.999999999.
#### Examples of basic syntax
The examples below use the following sample data in the `transportation`
database.
The measurement `bus_data` stores 15-minute resolution data on the number of bus
`passengers` and `complaints`:
```sql
name: bus_data
--------------
time passengers complaints
2016-08-28T07:00:00Z 5 9
2016-08-28T07:15:00Z 8 9
2016-08-28T07:30:00Z 8 9
2016-08-28T07:45:00Z 7 9
2016-08-28T08:00:00Z 8 9
2016-08-28T08:15:00Z 15 7
2016-08-28T08:30:00Z 15 7
2016-08-28T08:45:00Z 17 7
2016-08-28T09:00:00Z 20 7
```
##### Automatically downsampling data
Use a simple CQ to automatically downsample data from a single field
and write the results to another measurement in the same database.
```sql
CREATE CONTINUOUS QUERY "cq_basic" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```
`cq_basic` calculates the average hourly number of passengers from the
`bus_data` measurement and stores the results in the `average_passengers`
measurement in the `transportation` database.
`cq_basic` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
Every hour, `cq_basic` runs a single query that covers the time range between
`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time
range between `now()` and one hour prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:00** `cq_basic` executes a query with the time range `time >= '7:00' AND time < '08:00'`.
`cq_basic` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
>
At **9:00** `cq_basic` executes a query with the time range `time >= '8:00' AND time < '9:00'`.
`cq_basic` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:00:00Z 13.75
```
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
2016-08-28T08:00:00Z 13.75
```
##### Automatically downsampling data into another retention policy
[Fully qualify](/influxdb/v1.8/query_language/data_exploration/#the-basic-select-statement)
the destination measurement to store the downsampled data in a non-`DEFAULT`
[retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) (RP).
```sql
CREATE CONTINUOUS QUERY "cq_basic_rp" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "transportation"."three_weeks"."average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```
`cq_basic_rp` calculates the average hourly number of passengers from the
`bus_data` measurement and stores the results in the `transportation` database,
the `three_weeks` RP, and the `average_passengers` measurement.
`cq_basic_rp` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
Every hour, `cq_basic_rp` runs a single query that covers the time range between
`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time
range between `now()` and one hour prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:00** `cq_basic_rp` executes a query with the time range `time >= '7:00' AND time < '8:00'`.
`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
>
At **9:00** `cq_basic_rp` executes a query with the time range
`time >= '8:00' AND time < '9:00'`.
`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:00:00Z 13.75
```
Here are the results:
```sql
> SELECT * FROM "transportation"."three_weeks"."average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
2016-08-28T08:00:00Z 13.75
```
`cq_basic_rp` uses CQs and retention policies to automatically downsample data
and keep those downsampled data for an alternative length of time.
See the [Downsampling and Data Retention](/influxdb/v1.8/guides/downsampling_and_retention/)
guide for an in-depth discussion about this CQ use case.
##### Automatically downsampling a database with backreferencing
Use a function with a wildcard (`*`) and the `INTO` query's
[backreferencing syntax](/influxdb/v1.8/query_language/data_exploration/#the-into-clause)
to automatically downsample data from all measurements and numerical fields in
a database.
```sql
CREATE CONTINUOUS QUERY "cq_basic_br" ON "transportation"
BEGIN
SELECT mean(*) INTO "downsampled_transportation"."autogen".:MEASUREMENT FROM /.*/ GROUP BY time(30m),*
END
```
`cq_basic_br` calculates the 30-minute average of `passengers` and `complaints`
from every measurement in the `transportation` database (in this case, there's only the
`bus_data` measurement).
It stores the results in the `downsampled_transportation` database.
`cq_basic_br` executes at 30-minute intervals, the same interval as the
`GROUP BY time()` interval.
Every 30 minutes, `cq_basic_br` runs a single query that covers the time range
between `now()` and `now()` minus the `GROUP BY time()` interval, that is,
the time range between `now()` and 30 minutes prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **7:30**, `cq_basic_br` executes a query with the time range `time >= '7:00' AND time < '7:30'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
name: bus_data
--------------
time mean_complaints mean_passengers
2016-08-28T07:00:00Z 9 6.5
>
At **8:00**, `cq_basic_br` executes a query with the time range `time >= '7:30' AND time < '8:00'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
name: bus_data
--------------
time mean_complaints mean_passengers
2016-08-28T07:30:00Z 9 7.5
>
[...]
>
At **9:00**, `cq_basic_br` executes a query with the time range `time >= '8:30' AND time < '9:00'`.
`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
>
name: bus_data
--------------
time mean_complaints mean_passengers
2016-08-28T08:30:00Z 7 16
```
Here are the results:
```sql
> SELECT * FROM "downsampled_transportation."autogen"."bus_data"
name: bus_data
--------------
time mean_complaints mean_passengers
2016-08-28T07:00:00Z 9 6.5
2016-08-28T07:30:00Z 9 7.5
2016-08-28T08:00:00Z 8 11.5
2016-08-28T08:30:00Z 7 16
```
##### Automatically downsampling data and configuring CQ time boundaries
Use an
[offset interval](/influxdb/v1.8/query_language/data_exploration/#advanced-group-by-time-syntax)
in the `GROUP BY time()` clause to alter both the CQ's default execution time and
preset time boundaries.
```sql
CREATE CONTINUOUS QUERY "cq_basic_offset" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h,15m)
END
```
`cq_basic_offset` calculates the average hourly number of passengers from the
`bus_data` measurement and stores the results in the `average_passengers`
measurement.
`cq_basic_offset` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
The 15-minute offset interval forces the CQ to execute 15 minutes after the
default execution time; `cq_basic_offset` executes at 8:15 instead of 8:00.
Every hour, `cq_basic_offset` runs a single query that covers the time range
between `now()` and `now()` minus the `GROUP BY time()` interval, that is, the
time range between `now()` and one hour prior to `now()`.
The 15-minute offset interval shifts forward the generated preset time boundaries in the
CQ's `WHERE` clause; `cq_basic_offset` queries between 7:15 and 8:14.999999999 instead of 7:00 and 7:59.999999999.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:15** `cq_basic_offset` executes a query with the time range `time >= '7:15' AND time < '8:15'`.
`cq_basic_offset` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:15:00Z 7.75
>
At **9:15** `cq_basic_offset` executes a query with the time range `time >= '8:15' AND time < '9:15'`.
`cq_basic_offset` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:15:00Z 16.75
```
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T07:15:00Z 7.75
2016-08-28T08:15:00Z 16.75
```
Notice that the timestamps are for 7:15 and 8:15 instead of 7:00 and 8:00.
#### Common issues with basic syntax
##### Handling time intervals with no data
CQs do not write any results for a time interval if no data fall within that
time range.
Note that the basic syntax does not support using
[`fill()`](/influxdb/v1.8/query_language/data_exploration/#group-by-time-intervals-and-fill)
to change the value reported for intervals with no data.
Basic syntax CQs ignore `fill()` if it's included in the CQ query.
A possible workaround is to use the
[advanced CQ syntax](#configuring-cq-time-ranges-and-filling-empty-results), as in the sketch below.
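For reference, a minimal sketch of that workaround (the CQ name and the `100` fill value are illustrative; the measurement and field names reuse this page's `transportation` sample):

```sql
CREATE CONTINUOUS QUERY "cq_fill_workaround" ON "transportation"
RESAMPLE FOR 2h
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) fill(100)
END
```

With the `RESAMPLE FOR` clause in place, `fill(100)` reports `100` for any one-hour interval in the covered time range that has no data.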
##### Resampling previous time intervals
The basic CQ runs a single query that covers the time range between `now()`
and `now()` minus the `GROUP BY time()` interval.
See the [advanced syntax](#advanced-syntax) for how to configure the query's
time range.
##### Backfilling results for older data
CQs operate on real-time data, that is, data with timestamps that occur
relative to [`now()`](/influxdb/v1.8/concepts/glossary/#now).
Use a basic
[`INTO` query](/influxdb/v1.8/query_language/data_exploration/#the-into-clause)
to backfill results for data with older timestamps.
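A minimal sketch of such a backfill, reusing the `transportation` sample from this page (the time bounds are illustrative):

```sql
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data"
WHERE time >= '2016-08-01T00:00:00Z' AND time < '2016-08-28T00:00:00Z'
GROUP BY time(1h)
```

Run once, this writes hourly averages for the older time range that the CQ never covered.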
##### Missing tags in the CQ results
By default, all
[`INTO` queries](/influxdb/v1.8/query_language/data_exploration/#the-into-clause)
convert any tags in the source measurement to fields in the destination
measurement.
Include `GROUP BY *` in the CQ to preserve tags in the destination measurement.
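A minimal sketch, reusing the `transportation` sample (the CQ name is illustrative):

```sql
CREATE CONTINUOUS QUERY "cq_keep_tags" ON "transportation"
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h), *
END
```

The trailing `*` in the `GROUP BY` clause keeps every tag in `bus_data` as a tag in `average_passengers`.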
### Advanced syntax
```txt
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
RESAMPLE EVERY <interval> FOR <interval>
BEGIN
<cq_query>
END
```
#### Description of advanced syntax
##### The `cq_query`
See [Description of basic syntax](/influxdb/v1.8/query_language/continuous_queries/#description-of-basic-syntax).
##### Scheduling and coverage
CQs operate on real-time data. With the advanced syntax, CQs use the local
server's timestamp, the information in the `RESAMPLE` clause, and the InfluxDB
server's preset time boundaries to determine when to execute and what time range to
cover in the query.
CQs execute at the same interval as the `EVERY` interval in the `RESAMPLE`
clause, and they run at the start of InfluxDB's preset time boundaries.
If the `EVERY` interval is two hours, InfluxDB executes the CQ at the top of
every other hour.
When the CQ executes, it runs a single query for the time range between
[`now()`](/influxdb/v1.8/concepts/glossary/#now) and `now()` minus the `FOR` interval in the `RESAMPLE` clause.
If the `FOR` interval is two hours and the current time is 17:00, the query's
time range is between 15:00 and 16:59.999999999.
Both the `EVERY` interval and the `FOR` interval accept
[duration literals](/influxdb/v1.8/query_language/spec/#durations).
The `RESAMPLE` clause works with either or both of the `EVERY` and `FOR` intervals
configured.
CQs default to the relevant
[basic syntax behavior](/influxdb/v1.8/query_language/continuous_queries/#description-of-basic-syntax)
if the `EVERY` interval or `FOR` interval is not provided (see the first issue in
[Common Issues with Advanced Syntax](/influxdb/v1.8/query_language/continuous_queries/#common-issues-with-advanced-syntax)
for an anomalous case).
#### Examples of advanced syntax
The examples below use the following sample data in the `transportation` database.
The measurement `bus_data` stores 15-minute resolution data on the number of bus
`passengers`:
```sql
name: bus_data
--------------
time passengers
2016-08-28T06:30:00Z 2
2016-08-28T06:45:00Z 4
2016-08-28T07:00:00Z 5
2016-08-28T07:15:00Z 8
2016-08-28T07:30:00Z 8
2016-08-28T07:45:00Z 7
2016-08-28T08:00:00Z 8
2016-08-28T08:15:00Z 15
2016-08-28T08:30:00Z 15
2016-08-28T08:45:00Z 17
2016-08-28T09:00:00Z 20
```
##### Configuring execution intervals
Use an `EVERY` interval in the `RESAMPLE` clause to specify the CQ's execution
interval.
```sql
CREATE CONTINUOUS QUERY "cq_advanced_every" ON "transportation"
RESAMPLE EVERY 30m
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```
`cq_advanced_every` calculates the one-hour average of `passengers`
from the `bus_data` measurement and stores the results in the
`average_passengers` measurement in the `transportation` database.
`cq_advanced_every` executes at 30-minute intervals, the same interval as the
`EVERY` interval.
Every 30 minutes, `cq_advanced_every` runs a single query that covers the time
range for the current time bucket, that is, the one-hour time bucket that
intersects with `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
>
At **8:30**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:00:00Z 12.6667
>
At **9:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_every` writes one point to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:00:00Z 13.75
```
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 7
2016-08-28T08:00:00Z 13.75
```
Notice that `cq_advanced_every` calculates the result for the 8:00 time interval
twice.
First, it runs at 8:30 and calculates the average for every available data point
between 8:00 and 9:00 (`8`,`15`, and `15`).
Second, it runs at 9:00 and calculates the average for every available data
point between 8:00 and 9:00 (`8`, `15`, `15`, and `17`).
Because of the way InfluxDB
[handles duplicate points](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points), the second result simply overwrites the first result.
##### Configuring time ranges for resampling
Use a `FOR` interval in the `RESAMPLE` clause to specify the length of the CQ's
time range.
```sql
CREATE CONTINUOUS QUERY "cq_advanced_for" ON "transportation"
RESAMPLE FOR 1h
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END
```
`cq_advanced_for` calculates the 30-minute average of `passengers`
from the `bus_data` measurement and stores the results in the `average_passengers`
measurement in the `transportation` database.
`cq_advanced_for` executes at 30-minute intervals, the same interval as the
`GROUP BY time()` interval.
Every 30 minutes, `cq_advanced_for` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and one hour prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 6.5
2016-08-28T07:30:00Z 7.5
>
At **8:30** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:30' AND time < '8:30'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:30:00Z 7.5
2016-08-28T08:00:00Z 11.5
>
At **9:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`.
`cq_advanced_for` writes two points to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T08:00:00Z 11.5
2016-08-28T08:30:00Z 16
```
Notice that `cq_advanced_for` will calculate the result for every time interval
twice.
The CQ calculates the average for the 7:30 time interval at 8:00 and at 8:30,
and it calculates the average for the 8:00 time interval at 8:30 and 9:00.
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T07:00:00Z 6.5
2016-08-28T07:30:00Z 7.5
2016-08-28T08:00:00Z 11.5
2016-08-28T08:30:00Z 16
```
##### Configuring execution intervals and CQ time ranges
Use an `EVERY` interval and `FOR` interval in the `RESAMPLE` clause to specify
the CQ's execution interval and the length of the CQ's time range.
```sql
CREATE CONTINUOUS QUERY "cq_advanced_every_for" ON "transportation"
RESAMPLE EVERY 1h FOR 90m
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END
```
`cq_advanced_every_for` calculates the 30-minute average of
`passengers` from the `bus_data` measurement and stores the results in the
`average_passengers` measurement in the `transportation` database.
`cq_advanced_every_for` executes at one-hour intervals, the same interval as the
`EVERY` interval.
Every hour, `cq_advanced_every_for` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and 90 minutes prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **8:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '6:30' AND time < '8:00'`.
`cq_advanced_every_for` writes three points to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T06:30:00Z 3
2016-08-28T07:00:00Z 6.5
2016-08-28T07:30:00Z 7.5
>
At **9:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '7:30' AND time < '9:00'`.
`cq_advanced_every_for` writes three points to the `average_passengers` measurement:
>
name: average_passengers
------------------------
time mean
2016-08-28T07:30:00Z 7.5
2016-08-28T08:00:00Z 11.5
2016-08-28T08:30:00Z 16
```
Notice that `cq_advanced_every_for` will calculate the result for every time
interval twice.
The CQ calculates the average for the 7:30 interval at 8:00 and 9:00.
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T06:30:00Z 3
2016-08-28T07:00:00Z 6.5
2016-08-28T07:30:00Z 7.5
2016-08-28T08:00:00Z 11.5
2016-08-28T08:30:00Z 16
```
##### Configuring CQ time ranges and filling empty results
Use a `FOR` interval and `fill()` to change the value reported for time
intervals with no data.
Note that at least one data point must fall within the `FOR` interval for `fill()`
to operate.
If no data fall within the `FOR` interval the CQ writes no points to the
destination measurement.
```sql
CREATE CONTINUOUS QUERY "cq_advanced_for_fill" ON "transportation"
RESAMPLE FOR 2h
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) fill(1000)
END
```
`cq_advanced_for_fill` calculates the one-hour average of `passengers` from the
`bus_data` measurement and stores the results in the `average_passengers`
measurement in the `transportation` database.
Where possible, it writes the value `1000` for time intervals with no results.
`cq_advanced_for_fill` executes at one-hour intervals, the same interval as the
`GROUP BY time()` interval.
Every hour, `cq_advanced_for_fill` runs a single query that covers the time
range between `now()` and `now()` minus the `FOR` interval, that is, the time
range between `now()` and two hours prior to `now()`.
Annotated log output on the morning of August 28, 2016:
```sql
>
At **6:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '4:00' AND time < '6:00'`.
`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data
that fall within that time range.
>
At **7:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '5:00' AND time < '7:00'`.
`cq_advanced_for_fill` writes two points to `average_passengers`:
>
name: average_passengers
------------------------
time mean
2016-08-28T05:00:00Z 1000 <------ fill(1000)
2016-08-28T06:00:00Z 3 <------ average of 2 and 4
>
[...]
>
At **11:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '9:00' AND time < '11:00'`.
`cq_advanced_for_fill` writes two points to `average_passengers`:
>
name: average_passengers
------------------------
time mean
2016-08-28T09:00:00Z 20 <------ average of 20
2016-08-28T10:00:00Z 1000 <------ fill(1000)
>
At **12:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '10:00' AND time < '12:00'`.
`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data
that fall within that time range.
```
Here are the results:
```sql
> SELECT * FROM "average_passengers"
name: average_passengers
------------------------
time mean
2016-08-28T05:00:00Z 1000
2016-08-28T06:00:00Z 3
2016-08-28T07:00:00Z 7
2016-08-28T08:00:00Z 13.75
2016-08-28T09:00:00Z 20
2016-08-28T10:00:00Z 1000
```
> **Note:** `fill(previous)` doesn't fill the result for a time interval if the
previous value is outside the query's time range.
See [Frequently Asked Questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#why-does-fill-previous-return-empty-results)
for more information.
#### Common issues with advanced syntax
##### If the `EVERY` interval is greater than the `GROUP BY time()` interval
If the `EVERY` interval is greater than the `GROUP BY time()` interval, the CQ
executes at the same interval as the `EVERY` interval and runs a single query
that covers the time range between `now()` and `now()` minus the `EVERY`
interval (not between `now()` and `now()` minus the `GROUP BY time()` interval).
For example, if the `GROUP BY time()` interval is `5m` and the `EVERY` interval
is `10m`, the CQ executes every ten minutes.
Every ten minutes, the CQ runs a single query that covers the time range
between `now()` and `now()` minus the `EVERY` interval, that is, the time
range between `now()` and ten minutes prior to `now()`.
This behavior is intentional and prevents the CQ from missing data between
execution times.
##### If the `FOR` interval is less than the execution interval
If the `FOR` interval is less than the `GROUP BY time()` interval or, if
specified, the `EVERY` interval, InfluxDB returns the following error:
```sql
error parsing query: FOR duration must be >= GROUP BY time duration: must be a minimum of <minimum-allowable-interval> got <user-specified-interval>
```
To avoid missing data between execution times, the `FOR` interval must be equal
to or greater than the `GROUP BY time()` interval or, if specified, the `EVERY`
interval.
Currently, this is the intended behavior.
GitHub Issue [#6963](https://github.com/influxdata/influxdb/issues/6963)
outlines a feature request for CQs to support gaps in data coverage.
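For illustration, a CQ like the following sketch (the name is illustrative; the schema reuses the `transportation` sample) triggers the error above because its `FOR` interval (`30m`) is shorter than its `GROUP BY time()` interval (`1h`):

```sql
CREATE CONTINUOUS QUERY "cq_invalid_for" ON "transportation"
RESAMPLE FOR 30m
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END
```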
## Continuous query management
Only admin users are allowed to work with CQs. For more on user privileges, see [Authentication and Authorization](/influxdb/v1.8/administration/authentication_and_authorization/#user-types-and-privileges).
### Listing continuous queries
List every CQ on an InfluxDB instance with:
```sql
SHOW CONTINUOUS QUERIES
```
`SHOW CONTINUOUS QUERIES` groups results by database.
##### Examples
The output shows that the `telegraf` and `mydb` databases have CQs:
```sql
> SHOW CONTINUOUS QUERIES
name: _internal
---------------
name query
name: telegraf
--------------
name query
idle_hands CREATE CONTINUOUS QUERY idle_hands ON telegraf BEGIN SELECT min(usage_idle) INTO telegraf.autogen.min_hourly_cpu FROM telegraf.autogen.cpu GROUP BY time(1h) END
feeling_used CREATE CONTINUOUS QUERY feeling_used ON telegraf BEGIN SELECT mean(used) INTO downsampled_telegraf.autogen.:MEASUREMENT FROM telegraf.autogen./.*/ GROUP BY time(1h) END
name: downsampled_telegraf
--------------------------
name query
name: mydb
----------
name query
vampire CREATE CONTINUOUS QUERY vampire ON mydb BEGIN SELECT count(dracula) INTO mydb.autogen.all_of_them FROM mydb.autogen.one GROUP BY time(5m) END
```
### Deleting continuous queries
Delete a CQ from a specific database with:
```sql
DROP CONTINUOUS QUERY <cq_name> ON <database_name>
```
`DROP CONTINUOUS QUERY` returns an empty result.
##### Examples
Drop the `idle_hands` CQ from the `telegraf` database:
```sql
> DROP CONTINUOUS QUERY "idle_hands" ON "telegraf"
>
```
### Altering continuous queries
CQs cannot be altered once they're created.
To change a CQ, you must `DROP` and re-`CREATE` it with the updated settings, as in the sketch below.
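For example, to change the `GROUP BY time()` interval of the `cq_basic` CQ from earlier on this page, you might drop it and recreate it (the new `2h` interval is illustrative):

```sql
DROP CONTINUOUS QUERY "cq_basic" ON "transportation"

CREATE CONTINUOUS QUERY "cq_basic" ON "transportation"
BEGIN
  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(2h)
END
```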
### Continuous query statistics
If `query-stats-enabled` is set to `true` in your `influxdb.conf`, or by the `INFLUXDB_CONTINUOUS_QUERIES_QUERY_STATS_ENABLED` environment variable, data is written to `_internal` with information about when continuous queries ran and their duration.
Information about CQ configuration settings is available in the [Configuration](/influxdb/v1.8/administration/config/#continuous-queries-settings) documentation.
> **Note:** `_internal` houses internal system data and is meant for internal use.
The structure of and data stored in `_internal` can change at any time.
Use of this data falls outside the scope of official InfluxData support.
## Continuous query use cases
### Downsampling and data retention
Use CQs with InfluxDB database
[retention policies](/influxdb/v1.8/concepts/glossary/#retention-policy-rp)
(RPs) to mitigate storage concerns.
Combine CQs and RPs to automatically downsample high precision data to a lower
precision and remove the dispensable, high precision data from the database.
See the
[Downsampling and data retention](/influxdb/v1.8/guides/downsampling_and_retention/)
guide for a detailed walkthrough of this common use case.
### Precalculating expensive queries
Shorten query runtimes by pre-calculating expensive queries with CQs.
Use a CQ to automatically downsample commonly-queried, high precision data to a
lower precision.
Queries on lower precision data require fewer resources and return faster.
**Tip:** Pre-calculate queries for your preferred graphing tool to accelerate
the population of graphs and dashboards.
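As a sketch, a CQ along these lines (all names here are hypothetical) keeps a 30-minute rollup ready for a dashboard that would otherwise aggregate raw data on every refresh:

```sql
CREATE CONTINUOUS QUERY "cq_dashboard_cpu" ON "telegraf"
BEGIN
  SELECT mean("usage_idle") INTO "cpu_idle_30m" FROM "cpu" GROUP BY time(30m), *
END
```

The dashboard then queries the `cpu_idle_30m` measurement instead of the raw `cpu` measurement.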
### Substituting for a `HAVING` clause
InfluxQL does not support [`HAVING` clauses](https://en.wikipedia.org/wiki/Having_%28SQL%29).
Get the same functionality by creating a CQ to aggregate the data and querying
the CQ results to apply the `HAVING` clause.
> **Note:** InfluxQL supports [subqueries](/influxdb/v1.8/query_language/data_exploration/#subqueries) which also offer similar functionality to `HAVING` clauses.
See [Data Exploration](/influxdb/v1.8/query_language/data_exploration/#subqueries) for more information.
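A minimal sketch of that subquery alternative, using the same `farm` measurement as the example below:

```sql
SELECT "mean_bees" FROM (SELECT mean("bees") AS "mean_bees" FROM "farm" GROUP BY time(30m)) WHERE "mean_bees" > 20
```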
##### Example
InfluxDB does not accept the following query with a `HAVING` clause.
The query calculates the average number of `bees` at 30-minute intervals and
requests averages that are greater than `20`.
```sql
SELECT mean("bees") FROM "farm" GROUP BY time(30m) HAVING mean("bees") > 20
```
To get the same results:
**1. Create a CQ**
This step performs the `mean("bees")` part of the query above.
Because this step creates a CQ, you only need to execute it once.
The following CQ automatically calculates the average number of `bees` at
30-minute intervals and writes those averages to the `mean_bees` field in the
`aggregate_bees` measurement.
```sql
CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT mean("bees") AS "mean_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END
```
**2. Query the CQ results**
This step performs the `HAVING mean("bees") > 20` part of the query above.
Query the data in the measurement `aggregate_bees` and request values of the `mean_bees` field that are greater than `20` in the `WHERE` clause:
```sql
SELECT "mean_bees" FROM "aggregate_bees" WHERE "mean_bees" > 20
```
### Substituting for nested functions
Some InfluxQL functions
[support nesting](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#which-influxql-functions-support-nesting)
of other functions.
Most do not.
If your function does not support nesting, you can get the same functionality using a CQ to calculate
the inner-most function.
Then simply query the CQ results to calculate the outer-most function.
> **Note:** InfluxQL supports [subqueries](/influxdb/v1.8/query_language/data_exploration/#subqueries) which also offer the same functionality as nested functions.
See [Data Exploration](/influxdb/v1.8/query_language/data_exploration/#subqueries) for more information.
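A minimal sketch of that subquery alternative, using the same `farm` measurement as the example below:

```sql
SELECT mean("count_bees") FROM (SELECT count("bees") AS "count_bees" FROM "farm" GROUP BY time(30m))
```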
##### Example
InfluxDB does not accept the following query with a nested function.
The query calculates the number of non-null values
of `bees` at 30-minute intervals and the average of those counts:
```sql
SELECT mean(count("bees")) FROM "farm" GROUP BY time(30m)
```
To get the same results:
**1. Create a CQ**
This step performs the `count("bees")` part of the nested function above.
Because this step creates a CQ, you only need to execute it once.
The following CQ automatically calculates the number of non-null values of `bees` at 30-minute intervals
and writes those counts to the `count_bees` field in the `aggregate_bees` measurement.
```sql
CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT count("bees") AS "count_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END
```
**2. Query the CQ results**
This step performs the `mean([...])` part of the nested function above.
Query the data in the measurement `aggregate_bees` to calculate the average of the
`count_bees` field:
```sql
SELECT mean("count_bees") FROM "aggregate_bees" WHERE time >= <start_time> AND time <= <end_time>
```
## Further information
To see how to combine CQs and retention policies to periodically downsample
data and automatically expire the dispensable high precision data, see
[Downsampling and data retention](/influxdb/v1.8/guides/downsampling_and_retention/).
Kapacitor, InfluxData's data processing engine, can do the same work as
continuous queries in InfluxDB databases.
To learn when to use Kapacitor instead of InfluxDB and how to perform the same CQ
functionality with a TICKscript, see [examples of continuous queries in Kapacitor](/kapacitor/latest/examples/continuous_queries/).

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,369 @@
---
title: Manage your database using InfluxQL
description: Use InfluxQL to administer your InfluxDB server and work with InfluxDB databases, retention policies, series, measurements, and shards.
menu:
influxdb_1_8:
name: Manage your database
weight: 40
parent: InfluxQL
aliases:
- /influxdb/v1.8/query_language/database_management/
---
InfluxQL offers a full suite of administrative commands.
<table style="width:100%">
<tr>
<td><b>Data Management:</b></td>
<td><b>Retention Policy Management:</b></td>
</tr>
<tr>
<td><a href="#create-database">CREATE DATABASE</a></td>
<td><a href="#create-retention-policies-with-create-retention-policy">CREATE RETENTION POLICY</a></td>
</tr>
<tr>
<td><a href="#delete-a-database-with-drop-database">DROP DATABASE</a></td>
<td><a href="#modify-retention-policies-with-alter-retention-policy">ALTER RETENTION POLICY</a></td>
</tr>
<tr>
<td><a href="#drop-series-from-the-index-with-drop-series">DROP SERIES</a></td>
<td><a href="#delete-retention-policies-with-drop-retention-policy">DROP RETENTION POLICY</a></td>
</tr>
<tr>
<td><a href="#delete-series-with-delete">DELETE</a></td>
<td></td>
</tr>
<tr>
<td><a href="#delete-measurements-with-drop-measurement">DROP MEASUREMENT</a></td>
<td></td>
</tr>
<tr>
<td><a href="#delete-a-shard-with-drop-shard">DROP SHARD</a></td>
<td></td>
</tr>
</table>
If you're looking for `SHOW` queries (for example, `SHOW DATABASES` or `SHOW RETENTION POLICIES`), see [Schema Exploration](/influxdb/v1.8/query_language/schema_exploration).
The examples in the sections below use the InfluxDB [Command Line Interface (CLI)](/influxdb/v1.8/introduction/getting-started/).
You can also execute the commands using the InfluxDB API; simply send a `GET` request to the `/query` endpoint and include the command in the URL parameter `q`.
For more on using the InfluxDB API, see [Querying data](/influxdb/v1.8/guides/querying_data/).
> **Note:** When authentication is enabled, only admin users can execute most of the commands listed on this page.
> See the documentation on [authentication and authorization](/influxdb/v1.8/administration/authentication_and_authorization/) for more information.
## Data management
### CREATE DATABASE
Creates a new database.
#### Syntax
```sql
CREATE DATABASE <database_name> [WITH [DURATION <duration>] [REPLICATION <n>] [SHARD DURATION <duration>] [NAME <retention-policy-name>]]
```
#### Description of syntax
`CREATE DATABASE` requires a database [name](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#what-words-and-characters-should-i-avoid-when-writing-data-to-influxdb).
The `WITH`, `DURATION`, `REPLICATION`, `SHARD DURATION`, and `NAME` clauses are optional and create a single [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) associated with the created database.
If you do not specify one of the clauses after `WITH`, the relevant behavior defaults to the `autogen` retention policy settings.
The created retention policy automatically serves as the database's default retention policy.
For more information about those clauses, see [Retention Policy Management](/influxdb/v1.8/query_language/database_management/#retention-policy-management).
A successful `CREATE DATABASE` query returns an empty result.
If you attempt to create a database that already exists, InfluxDB does nothing and does not return an error.
#### Examples
##### Create a database
```
> CREATE DATABASE "NOAA_water_database"
>
```
The query creates a database called `NOAA_water_database`.
[By default](/influxdb/v1.8/administration/config/#retention-autocreate-true), InfluxDB also creates the `autogen` retention policy and associates it with the `NOAA_water_database`.
##### Create a database with a specific retention policy
```
> CREATE DATABASE "NOAA_water_database" WITH DURATION 3d REPLICATION 1 SHARD DURATION 1h NAME "liquid"
>
```
The query creates a database called `NOAA_water_database`.
It also creates a default retention policy for `NOAA_water_database` with a `DURATION` of three days, a [replication factor](/influxdb/v1.8/concepts/glossary/#replication-factor) of one, a [shard group](/influxdb/v1.8/concepts/glossary/#shard-group) duration of one hour, and with the name `liquid`.
### Delete a database with DROP DATABASE
The `DROP DATABASE` query deletes all of the data, measurements, series, continuous queries, and retention policies from the specified database.
The query takes the following form:
```sql
DROP DATABASE <database_name>
```
Drop the database NOAA_water_database:
```bash
> DROP DATABASE "NOAA_water_database"
>
```
A successful `DROP DATABASE` query returns an empty result.
If you attempt to drop a database that does not exist, InfluxDB does not return an error.
### Drop series from the index with DROP SERIES
The `DROP SERIES` query deletes all points from a [series](/influxdb/v1.8/concepts/glossary/#series) in a database,
and it drops the series from the index.
> **Note:** `DROP SERIES` does not support time intervals in the `WHERE` clause.
See
[`DELETE`](/influxdb/v1.8/query_language/database_management/#delete-series-with-delete)
for that functionality.
The query takes the following form, where you must specify either the `FROM` clause or the `WHERE` clause:
```sql
DROP SERIES FROM <measurement_name[,measurement_name]> WHERE <tag_key>='<tag_value>'
```
Drop all series from a single measurement:
```sql
> DROP SERIES FROM "h2o_feet"
```
Drop series with a specific tag pair from a single measurement:
```sql
> DROP SERIES FROM "h2o_feet" WHERE "location" = 'santa_monica'
```
Drop all points in the series that have a specific tag pair from all measurements in the database:
```sql
> DROP SERIES WHERE "location" = 'santa_monica'
```
A successful `DROP SERIES` query returns an empty result.
### Delete series with DELETE
The `DELETE` query deletes all points from a
[series](/influxdb/v1.8/concepts/glossary/#series) in a database.
Unlike
[`DROP SERIES`](/influxdb/v1.8/query_language/database_management/#drop-series-from-the-index-with-drop-series), it does not drop the series from the index and it supports time intervals
in the `WHERE` clause.
The query takes the following form where you must include either the `FROM`
clause or the `WHERE` clause, or both:
```
DELETE FROM <measurement_name> WHERE [<tag_key>='<tag_value>'] | [<time interval>]
```
Delete all data associated with the measurement `h2o_feet`:
```
> DELETE FROM "h2o_feet"
```
Delete all data associated with the measurement `h2o_quality` and where the tag `randtag` equals `3`:
```
> DELETE FROM "h2o_quality" WHERE "randtag" = '3'
```
Delete all data in the database that occur before January 01, 2016:
```
> DELETE WHERE time < '2016-01-01'
```
A successful `DELETE` query returns an empty result.
Things to note about `DELETE`:
* `DELETE` supports
[regular expressions](/influxdb/v1.8/query_language/data_exploration/#regular-expressions)
in the `FROM` clause when specifying measurement names and in the `WHERE` clause
when specifying tag values.
* `DELETE` does not support [fields](/influxdb/v1.8/concepts/glossary/#field) in the `WHERE` clause.
* If you need to delete points in the future, you must specify that time period explicitly, because `DELETE` runs for `time < now()` by default (see GitHub Issue [#8007](https://github.com/influxdata/influxdb/issues/8007) for more on the syntax), as shown in the sketch below.
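A minimal sketch of deleting future-dated points by stating the time range explicitly (the timestamps are illustrative):

```sql
> DELETE FROM "h2o_feet" WHERE time >= '2025-01-01T00:00:00Z' AND time <= '2025-12-31T23:59:59Z'
```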
### Delete measurements with DROP MEASUREMENT
The `DROP MEASUREMENT` query deletes all data and series from the specified [measurement](/influxdb/v1.8/concepts/glossary/#measurement) and deletes the
measurement from the index.
The query takes the following form:
```sql
DROP MEASUREMENT <measurement_name>
```
Delete the measurement `h2o_feet`:
```sql
> DROP MEASUREMENT "h2o_feet"
```
> **Note:** `DROP MEASUREMENT` drops all data and series in the measurement.
It does not drop the associated continuous queries.
A successful `DROP MEASUREMENT` query returns an empty result.
{{% warn %}} Currently, InfluxDB does not support regular expressions with `DROP MEASUREMENTS`.
See GitHub Issue [#4275](https://github.com/influxdb/influxdb/issues/4275) for more information.
{{% /warn %}}
### Delete a shard with DROP SHARD
The `DROP SHARD` query deletes a shard. It also drops the shard from the
[metastore](/influxdb/v1.8/concepts/glossary/#metastore).
The query takes the following form:
```sql
DROP SHARD <shard_id_number>
```
Delete the shard with the id `1`:
```
> DROP SHARD 1
>
```
A successful `DROP SHARD` query returns an empty result.
InfluxDB does not return an error if you attempt to drop a shard that does not
exist.
## Retention policy management
The following sections cover how to create, alter, and delete retention policies.
Note that when you create a database, InfluxDB automatically creates a retention policy named `autogen` which has infinite retention.
You may disable its auto-creation in the [configuration file](/influxdb/v1.8/administration/config/#metastore-settings).
### Create retention policies with CREATE RETENTION POLICY
#### Syntax
```
CREATE RETENTION POLICY <retention_policy_name> ON <database_name> DURATION <duration> REPLICATION <n> [SHARD DURATION <duration>] [DEFAULT]
```
#### Description of syntax
##### `DURATION`
- The `DURATION` clause determines how long InfluxDB keeps the data.
The `<duration>` is a [duration literal](/influxdb/v1.8/query_language/spec/#durations)
or `INF` (infinite).
The minimum duration for a retention policy is one hour and the maximum
duration is `INF`.
##### `REPLICATION`
- The `REPLICATION` clause determines how many independent copies of each point
are stored in the [cluster](/influxdb/v1.8/high_availability/clusters/).
- By default, the replication factor `n` equals the number of data nodes. However, if you have four or more data nodes, the default replication factor `n` is 3.
- To ensure data is immediately available for queries, set the replication factor `n` to less than or equal to the number of data nodes in the cluster.
> **Important:** If you have four or more data nodes, verify that the database replication factor is correct.
- Replication factors do not serve a purpose with single node instances.
##### `SHARD DURATION`
- Optional. The `SHARD DURATION` clause determines the time range covered by a [shard group](/influxdb/v1.8/concepts/glossary/#shard-group).
- The `<duration>` is a [duration literal](/influxdb/v1.8/query_language/spec/#durations)
and does not support an `INF` (infinite) duration.
- By default, the shard group duration is determined by the retention policy's
`DURATION`:
| Retention Policy's DURATION | Shard Group Duration |
|---|---|
| < 2 days | 1 hour |
| >= 2 days and <= 6 months | 1 day |
| > 6 months | 7 days |
The minimum allowable `SHARD GROUP DURATION` is `1h`.
If the `CREATE RETENTION POLICY` query attempts to set the `SHARD GROUP DURATION` to less than `1h` and greater than `0s`, InfluxDB automatically sets the `SHARD GROUP DURATION` to `1h`.
If the `CREATE RETENTION POLICY` query attempts to set the `SHARD GROUP DURATION` to `0s`, InfluxDB automatically sets the `SHARD GROUP DURATION` according to the default settings listed above.
See
[Shard group duration management](/influxdb/v1.8/concepts/schema_and_data_layout/#shard-group-duration-management)
for recommended configurations.
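For example, this sketch (the retention policy name is illustrative) asks for a 30-minute shard group duration; because that is below the one-hour minimum, InfluxDB silently raises it to `1h`:

```sql
> CREATE RETENTION POLICY "two_days" ON "NOAA_water_database" DURATION 2d REPLICATION 1 SHARD DURATION 30m
```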
##### `DEFAULT`
Sets the new retention policy as the default retention policy for the database.
This setting is optional.
#### Examples
##### Create a retention policy
```
> CREATE RETENTION POLICY "one_day_only" ON "NOAA_water_database" DURATION 1d REPLICATION 1
>
```
The query creates a retention policy called `one_day_only` for the database
`NOAA_water_database` with a one day duration and a replication factor of one.
##### Create a DEFAULT retention policy
```sql
> CREATE RETENTION POLICY "one_day_only" ON "NOAA_water_database" DURATION 23h60m REPLICATION 1 DEFAULT
>
```
The query creates the same retention policy as the one in the example above, but
sets it as the default retention policy for the database.
A successful `CREATE RETENTION POLICY` query returns an empty response.
If you attempt to create a retention policy identical to one that already exists, InfluxDB does not return an error.
If you attempt to create a retention policy with the same name as an existing retention policy but with differing attributes, InfluxDB returns an error.
> **Note:** You can also specify a new retention policy in the `CREATE DATABASE` query.
See [Create a database with CREATE DATABASE](/influxdb/v1.8/query_language/database_management/#create-database).
### Modify retention policies with ALTER RETENTION POLICY
The `ALTER RETENTION POLICY` query takes the following form, where you must declare at least one of the retention policy attributes `DURATION`, `REPLICATION`, `SHARD DURATION`, or `DEFAULT`:
```sql
ALTER RETENTION POLICY <retention_policy_name> ON <database_name> DURATION <duration> REPLICATION <n> SHARD DURATION <duration> DEFAULT
```
{{% warn %}} Replication factors do not serve a purpose with single node instances.
{{% /warn %}}
First, create the retention policy `what_is_time` with a `DURATION` of two days:
```sql
> CREATE RETENTION POLICY "what_is_time" ON "NOAA_water_database" DURATION 2d REPLICATION 1
>
```
Modify `what_is_time` to have a three-week `DURATION`, a two-hour shard group duration, and make it the `DEFAULT` retention policy for `NOAA_water_database`.
```sql
> ALTER RETENTION POLICY "what_is_time" ON "NOAA_water_database" DURATION 3w SHARD DURATION 2h DEFAULT
>
```
In the last example, `what_is_time` retains its original replication factor of 1.
A successful `ALTER RETENTION POLICY` query returns an empty result.
### Delete retention policies with DROP RETENTION POLICY
Delete all measurements and data in a specific retention policy:
{{% warn %}}
Dropping a retention policy will permanently delete all measurements and data stored in the retention policy.
{{% /warn %}}
```sql
DROP RETENTION POLICY <retention_policy_name> ON <database_name>
```
Delete the retention policy `what_is_time` in the `NOAA_water_database` database:
```bash
> DROP RETENTION POLICY "what_is_time" ON "NOAA_water_database"
>
```
A successful `DROP RETENTION POLICY` query returns an empty result.
If you attempt to drop a retention policy that does not exist, InfluxDB does not return an error.

View File

@ -0,0 +1,307 @@
---
title: InfluxQL mathematical operators
menu:
influxdb_1_8:
name: Mathematical operators
weight: 70
parent: InfluxQL
---
Mathematical operators follow the [standard order of operations](https://golang.org/ref/spec#Operator_precedence).
That is, parentheses take precedence over division and multiplication, which take precedence over addition and subtraction.
For example, `5 / 2 + 3 * 2 = (5 / 2) + (3 * 2)` and `5 + 2 * 3 - 2 = 5 + (2 * 3) - 2`.
### Content
* [Mathematical Operators](#mathematical-operators)
* [Addition](#addition)
* [Subtraction](#subtraction)
* [Multiplication](#multiplication)
* [Division](#division)
* [Modulo](#modulo)
* [Bitwise AND](#bitwise-and)
* [Bitwise OR](#bitwise-or)
* [Bitwise Exclusive-OR](#bitwise-exclusive-or)
* [Common Issues with Mathematical Operators](#common-issues-with-mathematical-operators)
* [Unsupported Operators](#unsupported-operators)
## Mathematical Operators
### Addition
Perform addition with a constant.
```sql
SELECT "A" + 5 FROM "add"
```
```sql
SELECT * FROM "add" WHERE "A" + 5 > 10
```
Perform addition on two fields.
```sql
SELECT "A" + "B" FROM "add"
```
```sql
SELECT * FROM "add" WHERE "A" + "B" >= 10
```
### Subtraction
Perform subtraction with a constant.
```sql
SELECT 1 - "A" FROM "sub"
```
```sql
SELECT * FROM "sub" WHERE 1 - "A" <= 3
```
Perform subtraction with two fields.
```sql
SELECT "A" - "B" FROM "sub"
```
```sql
SELECT * FROM "sub" WHERE "A" - "B" <= 1
```
### Multiplication
Perform multiplication with a constant.
```sql
SELECT 10 * "A" FROM "mult"
```
```sql
SELECT * FROM "mult" WHERE "A" * 10 >= 20
```
Perform multiplication with two fields.
```sql
SELECT "A" * "B" * "C" FROM "mult"
```
```sql
SELECT * FROM "mult" WHERE "A" * "B" <= 80
```
Multiplication distributes across other operators.
```sql
SELECT 10 * ("A" + "B" + "C") FROM "mult"
```
```sql
SELECT 10 * ("A" - "B" - "C") FROM "mult"
```
```sql
SELECT 10 * ("A" + "B" - "C") FROM "mult"
```
### Division
Perform division with a constant.
```sql
SELECT 10 / "A" FROM "div"
```
```sql
SELECT * FROM "div" WHERE "A" / 10 <= 2
```
Perform division with two fields.
```sql
SELECT "A" / "B" FROM "div"
```
```sql
SELECT * FROM "div" WHERE "A" / "B" >= 10
```
Division distributes across other operators.
```sql
SELECT 10 / ("A" + "B" + "C") FROM "mult"
```
### Modulo
Perform modulo arithmetic with a constant.
```
SELECT "B" % 2 FROM "modulo"
```
```
SELECT "B" FROM "modulo" WHERE "B" % 2 = 0
```
Perform modulo arithmetic on two fields.
```
SELECT "A" % "B" FROM "modulo"
```
```
SELECT "A" FROM "modulo" WHERE "A" % "B" = 0
```
### Bitwise AND
You can use this operator with any integers or Booleans, whether they are fields or constants.
It does not work with float or string datatypes, and you cannot mix integers and Booleans.
```sql
SELECT "A" & 255 FROM "bitfields"
```
```sql
SELECT "A" & "B" FROM "bitfields"
```
```sql
SELECT * FROM "data" WHERE "bitfield" & 15 > 0
```
```sql
SELECT "A" & "B" FROM "booleans"
```
```sql
SELECT ("A" ^ true) & "B" FROM "booleans"
```
### Bitwise OR
You can use this operator with any integers or Booleans, whether they are fields or constants.
It does not work with float or string datatypes, and you cannot mix integers and Booleans.
```sql
SELECT "A" | 5 FROM "bitfields"
```
```sql
SELECT "A" | "B" FROM "bitfields"
```
```sql
SELECT * FROM "data" WHERE "bitfield" | 12 = 12
```
### Bitwise Exclusive-OR
You can use this operator with any integers or Booleans, whether they are fields or constants.
It does not work with float or string datatypes, and you cannot mix integers and Booleans.
```sql
SELECT "A" ^ 255 FROM "bitfields"
```
```sql
SELECT "A" ^ "B" FROM "bitfields"
```
```sql
SELECT * FROM "data" WHERE "bitfield" ^ 6 > 0
```
### Common Issues with Mathematical Operators
#### Issue 1: Mathematical operators with wildcards and regular expressions
InfluxDB does not support combining mathematical operations with a wildcard (`*`) or [regular expression](/influxdb/v1.8/query_language/data_exploration/#regular-expressions) in the `SELECT` clause.
The following queries are invalid and the system returns an error:
Perform a mathematical operation on a wildcard.
```
> SELECT * + 2 FROM "nope"
ERR: unsupported expression with wildcard: * + 2
```
Perform a mathematical operation on a wildcard within a function.
```
> SELECT COUNT(*) / 2 FROM "nope"
ERR: unsupported expression with wildcard: count(*) / 2
```
Perform a mathematical operation on a regular expression.
```
> SELECT /A/ + 2 FROM "nope"
ERR: error parsing query: found +, expected FROM at line 1, char 12
```
Perform a mathematical operation on a regular expression within a function.
```
> SELECT COUNT(/A/) + 2 FROM "nope"
ERR: unsupported expression with regex field: count(/A/) + 2
```
#### Issue 2: Mathematical operators with functions
The use of mathematical operators inside of function calls is currently unsupported.
Note that InfluxDB only allows functions in the `SELECT` clause.
For example,
```sql
SELECT 10 * mean("value") FROM "cpu"
```
will work. However,
```sql
SELECT mean(10 * "value") FROM "cpu"
```
will yield a parse error.
> InfluxQL supports [subqueries](/influxdb/v1.8/query_language/data_exploration/#subqueries) which offer similar functionality to using mathematical operators inside a function call.
See [Data Exploration](/influxdb/v1.8/query_language/data_exploration/#subqueries) for more information.
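A minimal sketch of that subquery alternative, reusing the `cpu` example above (the `scaled_value` alias is illustrative):

```sql
SELECT mean("scaled_value") FROM (SELECT 10 * "value" AS "scaled_value" FROM "cpu")
```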
## Unsupported Operators
### Inequalities
Using any of `=`,`!=`,`<`,`>`,`<=`,`>=`,`<>` in the `SELECT` clause yields empty results for all types.
See GitHub issue [3525](https://github.com/influxdb/influxdb/issues/3525).
### Logical Operators
Using any of `!|`, `NAND`, `XOR`, or `NOR` yields a parser error.
Additionally, using `AND` or `OR` in the `SELECT` clause of a query will not behave as a mathematical operator and simply yields empty results, as these are tokens in InfluxQL.
However, you can apply the bitwise operators `&`, `|` and `^` to Boolean data.
### Bitwise Not
There is no bitwise-not operator, because the results you expect depend on the width of your bitfield.
InfluxQL does not know how wide your bitfield is, so it cannot implement a suitable bitwise-not operator.
For example, if your bitfield is 8 bits wide, then to you the integer 1 represents the bits `0000 0001`.
The bitwise-not of this should return the bits `1111 1110`, i.e. the integer 254.
However, if your bitfield is 16 bits wide, then the integer 1 represents the bits `0000 0000 0000 0001`.
The bitwise-not of this should return the bits `1111 1111 1111 1110`, i.e. the integer 65534.
#### Solution
You can implement a bitwise-not operation by using the `^` (bitwise xor) operator together with the number representing all-ones for your word-width:
For 8-bit data:
```sql
SELECT "A" ^ 255 FROM "data"
```
For 16-bit data:
```sql
SELECT "A" ^ 65535 FROM "data"
```
For 32-bit data:
```sql
SELECT "A" ^ 4294967295 FROM "data"
```
In each case the constant you need can be calculated as `(2 ** width) - 1`.

View File

@ -0,0 +1,122 @@
---
title: Sample data
menu:
influxdb_1_8:
weight: 10
parent: InfluxQL
aliases:
- /influxdb/v1.8/sample_data/data_download/
- /influxdb/v1.8/query_language/data_download/
---
To explore the query language further, follow these instructions to create a database,
then download sample data and write it to that database within your InfluxDB installation.
The sample data is then used and referenced in [Data Exploration](../../query_language/data_exploration/),
[Schema Exploration](../../query_language/schema_exploration/), and [Functions](../../query_language/functions/).
## Creating a database
If you've installed InfluxDB locally, the `influx` command should be available via the command line.
Executing `influx` will start the CLI and automatically connect to the local InfluxDB instance
(assuming you have already started the server with `service influxdb start` or by running `influxd` directly).
The output should look like this:
```bash
$ influx -precision rfc3339
Connected to http://localhost:8086 version 1.4.x
InfluxDB shell 1.4.x
>
```
> **Notes:**
>
* The InfluxDB API runs on port `8086` by default.
Therefore, `influx` will connect to port `8086` and `localhost` by default.
If you need to alter these defaults, run `influx --help`.
* The [`-precision` argument](/influxdb/latest/tools/shell/#influx-options) specifies the format/precision of any returned timestamps.
In the example above, `rfc3339` tells InfluxDB to return timestamps in [RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt) (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`).
The command line is now ready to take input in the form of Influx Query Language (a.k.a. InfluxQL) statements.
To exit the InfluxQL shell, type `exit` and hit return.
A fresh install of InfluxDB has no databases (apart from the system `_internal`),
so creating one is our first task.
You can create a database with the `CREATE DATABASE <db-name>` InfluxQL statement,
where `<db-name>` is the name of the database you wish to create.
Names of databases can contain any unicode character as long as the string is double-quoted.
Names can also be left unquoted if they contain _only_ ASCII letters,
digits, or underscores and do not begin with a digit.
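For example (illustrative names only), an identifier-style name can be left bare, while a name containing spaces or hyphens must be double-quoted:

```
> CREATE DATABASE mydb
> CREATE DATABASE "santa monica-tides"
```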
Throughout the query language exploration, we'll use the database name `NOAA_water_database`:
```
> CREATE DATABASE NOAA_water_database
> exit
```
### Download and write the data to InfluxDB
From your terminal, download the text file that contains the data in [line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol) format:
```
curl https://s3.amazonaws.com/noaa.water-database/NOAA_data.txt -o NOAA_data.txt
```
Write the data to InfluxDB via the [CLI](../../tools/shell/):
```
influx -import -path=NOAA_data.txt -precision=s -database=NOAA_water_database
```
### Test queries
```bash
$ influx -precision rfc3339 -database NOAA_water_database
Connected to http://localhost:8086 version 1.4.x
InfluxDB shell 1.4.x
>
```
See all five measurements:
```bash
> SHOW measurements
name: measurements
------------------
name
average_temperature
h2o_feet
h2o_pH
h2o_quality
h2o_temperature
```
Count the number of non-null values of `water_level` in `h2o_feet`:
```bash
> SELECT COUNT("water_level") FROM h2o_feet
name: h2o_feet
--------------
time count
1970-01-01T00:00:00Z 15258
```
Select the first five observations in the measurement h2o_feet:
```bash
> SELECT * FROM h2o_feet LIMIT 5
name: h2o_feet
--------------
time level description location water_level
2015-08-18T00:00:00Z below 3 feet santa_monica 2.064
2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12
2015-08-18T00:06:00Z between 6 and 9 feet coyote_creek 8.005
2015-08-18T00:06:00Z below 3 feet santa_monica 2.116
2015-08-18T00:12:00Z between 6 and 9 feet coyote_creek 7.887
```
### Data sources and things to note
The sample data is publicly available data from the [National Oceanic and Atmospheric Administration's (NOAA) Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html?type=Water+Levels).
The data include 15,258 observations of water levels (ft) collected every six minutes at two stations (Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period from August 18, 2015 through September 18, 2015.
Note that the measurements `average_temperature`, `h2o_pH`, `h2o_quality`, and `h2o_temperature` contain fictional data.
Those measurements serve to illuminate query functionality in [Schema Exploration](../../query_language/schema_exploration/).
The `h2o_feet` measurement is the only measurement that contains the NOAA data.
Please note that the `level description` field isn't part of the original NOAA data - we snuck it in there for the sake of having a field key with a special character and string [field values](../../concepts/glossary/#field-value).

File diff suppressed because it is too large

View File

@ -0,0 +1,27 @@
---
title: Supported protocols in InfluxDB
menu:
influxdb_1_8:
name: Supported protocols
weight: 90
---
InfluxData supports the following protocols for interacting with InfluxDB:
### [CollectD](/influxdb/v1.8/supported_protocols/collectd)
Using the collectd input, InfluxDB can accept data transmitted in collectd native format. This data is transmitted over UDP.
### [Graphite](/influxdb/v1.8/supported_protocols/graphite)
The Graphite plugin allows measurements to be saved using the Graphite line protocol. By default, enabling the Graphite plugin will allow you to collect metrics and store them using the metric name as the measurement.
### [OpenTSDB](/influxdb/v1.8/supported_protocols/opentsdb)
InfluxDB supports both the telnet and HTTP OpenTSDB protocol.
This means that InfluxDB can act as a drop-in replacement for your OpenTSDB system.
### [Prometheus](/influxdb/v1.8/supported_protocols/prometheus)
InfluxDB provides native support for the Prometheus read and write API to convert remote reads and writes to InfluxDB queries and endpoints.
### [UDP](/influxdb/v1.8/supported_protocols/udp)
UDP (User Datagram Protocol) can be used to write to InfluxDB. The CollectD input accepts data transmitted in collectd native format over UDP.

View File

@ -0,0 +1,52 @@
---
title: CollectD protocol support in InfluxDB
aliases:
- /influxdb/v1.8/tools/collectd/
menu:
influxdb_1_8:
name: CollectD
weight: 10
parent: Supported protocols
---
# The collectd input
The [collectd](https://collectd.org) input allows InfluxDB to accept data transmitted in collectd native format. This data is transmitted over UDP.
## A note on UDP/IP buffer sizes
If you're running Linux or FreeBSD, please adjust your operating system UDP buffer size limit ([see here for more details](/influxdb/latest/supported_protocols/udp/#a-note-on-udpip-os-buffer-sizes)).
## Configuration
Each collectd input allows the binding address, target database, and target retention policy to be set. If the database does not exist, it will be created automatically when the input is initialized. If the retention policy is not configured, then the default retention policy for the database is used. However if the retention policy is set, the retention policy must be explicitly created. The input will not automatically create it.
Each collectd input also performs internal batching of the points it receives, as batched writes to the database are more efficient. The default batch size is 1000, pending batch factor is 5, with a batch timeout of 1 second. This means the input will write batches of maximum size 1000, but if a batch has not reached 1000 points within 1 second of the first point being added to a batch, it will emit that batch regardless of size. The pending batch factor controls how many batches can be in memory at once, allowing the input to transmit a batch, while still building other batches.
Multi-value plugins can be handled two ways. Setting parse-multivalue-plugin to "split" will parse and store the multi-value plugin data (e.g., df free:5000,used:1000) into separate measurements (e.g., (df_free, value=5000) (df_used, value=1000)), while "join" will parse and store the multi-value plugin as a single multi-value measurement (e.g., (df, free=5000,used=1000)). "split" is the default behavior for backward compatibility with previous versions of InfluxDB.
The path to the collectd types database file may also be set.
## Large UDP packets
Please note that UDP packets larger than the standard size of 1452 are dropped at the time of ingestion. Be sure to set `MaxPacketSize` to 1452 in the collectd configuration.
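For reference, a minimal sketch of the relevant settings in the collectd network plugin; the server hostname is a placeholder:

```
# /etc/collectd/collectd.conf (on the machine sending metrics)
LoadPlugin network
<Plugin network>
  Server "influxdb-host" "25826"
  MaxPacketSize 1452
</Plugin>
```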
## Config Example
```
[[collectd]]
enabled = true
bind-address = ":25826" # the bind address
database = "collectd" # Name of the database that will be written to
retention-policy = ""
batch-size = 5000 # will flush if this many points get buffered
batch-pending = 10 # number of batches that may be pending in memory
batch-timeout = "10s"
read-buffer = 0 # UDP read buffer size, 0 means to use OS default
typesdb = "/usr/share/collectd/types.db"
security-level = "none" # "none", "sign", or "encrypt"
auth-file = "/etc/collectd/auth_file"
parse-multivalue-plugin = "split" # "split" or "join"
```
Content from [README](https://github.com/influxdata/influxdb/tree/1.8/services/collectd/README.md) on GitHub.

View File

@ -0,0 +1,210 @@
---
title: Graphite protocol support in InfluxDB
aliases:
- /influxdb/v1.8/tools/graphite/
- /influxdb/v1.8/write_protocols/graphite/
menu:
influxdb_1_8:
name: Graphite
weight: 20
parent: Supported protocols
---
# The Graphite Input
## A Note On UDP/IP OS Buffer Sizes
If you're using UDP input and running Linux or FreeBSD, please adjust your UDP buffer size limit ([see here for more details](/influxdb/v1.8/supported_protocols/udp#a-note-on-udp-ip-os-buffer-sizes)).
## Configuration
Each Graphite input allows the binding address, target database, and protocol to be set. If the database does not exist, it will be created automatically when the input is initialized. The write-consistency-level can also be set. If any write operations do not meet the configured consistency guarantees, an error will occur and the data will not be indexed. The default consistency-level is `ONE`.
Each Graphite input also performs internal batching of the points it receives, as batched writes to the database are more efficient. The default _batch size_ is 1000, _pending batch_ factor is 5, with a _batch timeout_ of 1 second. This means the input will write batches of maximum size 1000, but if a batch has not reached 1000 points within 1 second of the first point being added to a batch, it will emit that batch regardless of size. The pending batch factor controls how many batches can be in memory at once, allowing the input to transmit a batch, while still building other batches.
## Parsing Metrics
The Graphite plugin allows measurements to be saved using the Graphite line protocol. By default, enabling the Graphite plugin will allow you to collect metrics and store them using the metric name as the measurement. If you send a metric named `servers.localhost.cpu.loadavg.10`, it will store the full metric name as the measurement with no extracted tags.
While this default setup works, it is not the ideal way to store measurements in InfluxDB since it does not take advantage of tags. It also will not perform optimally with large dataset sizes since queries will be forced to use regular expressions, which are known not to scale well.
To extract tags from metrics, one or more templates must be configured to parse metrics into tags and measurements.
## Templates
Templates allow matching parts of a metric name to be used as tag keys in the stored metric. They have a similar format to Graphite metric names. The values in between the separators are used as the tag keys. The location of the tag key that matches the same position as the Graphite metric section is used as the value. If there is no value, the Graphite portion is skipped.
The special value _measurement_ is used to define the measurement name. It can have a trailing `*` to indicate that the remainder of the metric should be used. If a _measurement_ is not specified, the full metric name is used.
### Basic Matching
`servers.localhost.cpu.loadavg.10`
* Template: `.host.resource.measurement*`
* Output: _measurement_ =`loadavg.10` _tags_ =`host=localhost resource=cpu`
### Multiple Measurement & Tags Matching
The _measurement_ can be specified multiple times in a template to provide more control over the measurement name. Tags can also be
matched multiple times. Multiple values will be joined together using the _Separator_ config variable. By default, this value is `.`.
`servers.localhost.localdomain.cpu.cpu0.user`
* Template: `.host.host.measurement.cpu.measurement`
* Output: _measurement_ = `cpu.user` _tags_ = `host=localhost.localdomain cpu=cpu0`
Since `.` requires queries on measurements to be double-quoted, you may want to set this to `_` to simplify querying parsed metrics.
`servers.localhost.cpu.cpu0.user`
* Separator: `_`
* Template: `.host.measurement.cpu.measurement`
* Output: _measurement_ = `cpu_user` _tags_ = `host=localhost cpu=cpu0`
### Adding Tags
Additional tags can be added to a metric if they don't exist on the received metric. You can add additional tags by specifying them after the pattern. Tags have the same format as the line protocol. Multiple tags are separated by commas.
`servers.localhost.cpu.loadavg.10`
* Template: `.host.resource.measurement* region=us-west,zone=1a`
* Output: _measurement_ = `loadavg.10` _tags_ = `host=localhost resource=cpu region=us-west zone=1a`
### Fields
A field key can be specified by using the keyword _field_. By default if no _field_ keyword is specified then the metric will be written to a field named _value_.
The field key can also be derived from the second "half" of the input metric name by specifying `field*` (e.g., `measurement.measurement.field*`). This cannot be used in conjunction with `measurement*`!
It's possible to amend measurement metrics with additional fields, e.g.:
Input:
```
sensu.metric.net.server0.eth0.rx_packets 461295119435 1444234982
sensu.metric.net.server0.eth0.tx_bytes 1093086493388480 1444234982
sensu.metric.net.server0.eth0.rx_bytes 1015633926034834 1444234982
sensu.metric.net.server0.eth0.tx_errors 0 1444234982
sensu.metric.net.server0.eth0.rx_errors 0 1444234982
sensu.metric.net.server0.eth0.tx_dropped 0 1444234982
sensu.metric.net.server0.eth0.rx_dropped 0 1444234982
```
With template:
```
sensu.metric.* ..measurement.host.interface.field
```
Becomes database entry:
```
> select * from net
name: net
---------
time host interface rx_bytes rx_dropped rx_errors rx_packets tx_bytes tx_dropped tx_errors
1444234982000000000 server0 eth0 1.015633926034834e+15 0 0 4.61295119435e+11 1.09308649338848e+15 0 0
```
## Multiple Templates
One template may not match all metrics. For example, using multiple plugins with diamond will produce metrics in different formats. If you need to use multiple templates, you'll need to define a prefix filter that must match before the template can be applied.
### Filters
Filters have a similar format to templates but work more like wildcard expressions. When multiple filters would match a metric, the more specific one is chosen. Filters are configured by adding them before the template.
For example,
```
servers.localhost.cpu.loadavg.10
servers.host123.elasticsearch.cache_hits 100
servers.host456.mysql.tx_count 10
servers.host789.prod.mysql.tx_count 10
```
* `servers.*` would match all values
* `servers.*.mysql` would match `servers.host456.mysql.tx_count 10`
* `servers.localhost.*` would match `servers.localhost.cpu.loadavg`
* `servers.*.*.mysql` would match `servers.host789.prod.mysql.tx_count 10`
## Default Templates
If no template filters are defined or you want to just have one basic template, you can define a default template. This template will apply to any metric that has not already matched a filter.
```
dev.http.requests.200
prod.myapp.errors.count
dev.db.queries.count
```
* `env.app.measurement*` would create
* _measurement_=`requests.200` _tags_=`env=dev,app=http`
* _measurement_= `errors.count` _tags_=`env=prod,app=myapp`
* _measurement_=`queries.count` _tags_=`env=dev,app=db`
## Global Tags
If you need to add the same set of tags to all metrics, you can define them globally at the plugin level and not within each template description.
## Minimal Config
```
[[graphite]]
enabled = true
# bind-address = ":2003"
# protocol = "tcp"
# consistency-level = "one"
### If matching multiple measurement files, this string will be used to join the matched values.
# separator = "."
### Default tags that will be added to all metrics. These can be overridden at the template level
### or by tags extracted from metric
# tags = ["region=us-east", "zone=1c"]
### Each template line requires a template pattern. It can have an optional
### filter before the template and separated by spaces. It can also have optional extra
### tags following the template. Multiple tags should be separated by commas and no spaces
### similar to the line protocol format. There can be only one default template.
# templates = [
# "*.app env.service.resource.measurement",
# # Default template
# "server.*",
#]
```
## Customized Config
```
[[graphite]]
enabled = true
separator = "_"
tags = ["region=us-east", "zone=1c"]
templates = [
# filter + template
"*.app env.service.resource.measurement",
# filter + template + extra tag
"stats.* .host.measurement* region=us-west,agent=sensu",
# filter + template with field key
"stats.* .host.measurement.field",
# default template. Ignore the first Graphite component "servers"
".measurement*",
]
```
## Two Graphite Listeners, UDP & TCP, Config
```
[[graphite]]
enabled = true
bind-address = ":2003"
protocol = "tcp"
# consistency-level = "one"
[[graphite]]
enabled = true
bind-address = ":2004" # the bind address
protocol = "udp" # protocol to read via
udp-read-buffer = 8388608 # (8*1024*1024) UDP read buffer size
```
Content from [README](https://github.com/influxdata/influxdb/tree/1.8/services/graphite/README.md) on GitHub.

View File

@ -0,0 +1,27 @@
---
title: OpenTSDB protocol support in InfluxDB
aliases:
- /influxdb/v1.8/tools/opentsdb/
menu:
influxdb_1_8:
name: OpenTSDB
weight: 30
parent: Supported protocols
---
## OpenTSDB Input
InfluxDB supports both the telnet and HTTP OpenTSDB protocol. This means that InfluxDB can act as a drop-in replacement for your OpenTSDB system.
## Configuration
The OpenTSDB inputs allow the binding address, target database, and target retention policy within that database to be set. If the database does not exist, it will be created automatically when the input is initialized. If you also decide to configure the retention policy (without configuration, the input will use the auto-created default retention policy), both the database and retention policy must already exist.
The `write-consistency-level` can also be set. If any write operations do not meet the configured consistency guarantees, an error will occur and the data will not be indexed. The default consistency-level is `ONE`.
The OpenTSDB input also performs internal batching of the points it receives, as batched writes to the database are more efficient. The default _batch size_ is 1000, _pending batch_ factor is 5, with a _batch timeout_ of 1 second. This means the input will write batches of maximum size 1000, but if a batch has not reached 1000 points within 1 second of the first point being added to a batch, it will emit that batch regardless of size. The pending batch factor controls how many batches can be in memory at once, allowing the input to transmit a batch, while still building other batches.
## Telegraf OpenTSDB output plugin
The [Telegraf OpenTSDB output plugin](https://github.com/influxdata/telegraf/blob/release-1.11/plugins/outputs/opentsdb/README.md)
outputs OpenTSDB protocol to an OpenTSDB endpoint.
Use the plugin to write to InfluxDB or other OpenTSDB-compatible endpoints.

View File

@ -0,0 +1,120 @@
---
title: Prometheus endpoints support in InfluxDB
menu:
influxdb_1_8:
name: Prometheus
weight: 40
parent: Supported protocols
---
## Prometheus remote read and write API support
{{% warn %}}
Note: The Prometheus [API Stability Guarantees](https://prometheus.io/docs/prometheus/latest/stability/)
states that remote read and remote write endpoints are features listed as experimental
or subject to change, and thus considered unstable for 2.x. Any breaking changes
will be included in the InfluxDB release notes.
{{% /warn %}}
InfluxDB support for the Prometheus remote read and write API adds the following
HTTP endpoints to InfluxDB:
* `/api/v1/prom/read`
* `/api/v1/prom/write`
Additionally, there is a [`/metrics` endpoint](/influxdb/v1.8/administration/server_monitoring/#influxdb-metrics-http-endpoint) configured to produce default Go metrics in Prometheus metrics format.
### Create a target database
Create a database in your InfluxDB instance to house data sent from Prometheus.
In the examples provided below, `prometheus` is used as the database name, but
you're welcome to use whatever database name you like.
```sql
CREATE DATABASE "prometheus"
```
### Configuration
To enable the use of the Prometheus remote read and write APIs with InfluxDB, add URL
values to the following settings in the [Prometheus configuration file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file):
* [`remote_write`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_write%3E)
* [`remote_read`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_read%3E)
The URLs must be resolvable from your running Prometheus server and use the port
on which InfluxDB is running (`8086` by default).
Also include the database name using the `db=` query parameter.
#### Example: Endpoints in Prometheus configuration file
```yaml
remote_write:
- url: "http://localhost:8086/api/v1/prom/write?db=prometheus"
remote_read:
- url: "http://localhost:8086/api/v1/prom/read?db=prometheus"
```
#### Read and write URLs with authentication
If [authentication is enabled on InfluxDB](/influxdb/v1.8/administration/authentication_and_authorization/),
pass the `username` and `password` of an InfluxDB user with read and write privileges
using the `u=` and `p=` query parameters respectively.
##### Examples of endpoints with authentication enabled
```yaml
remote_write:
- url: "http://localhost:8086/api/v1/prom/write?db=prometheus&u=username&p=password"
remote_read:
- url: "http://localhost:8086/api/v1/prom/read?db=prometheus&u=username&p=password"
```
> Including plain text passwords in your Prometheus configuration file is not ideal.
> Unfortunately, environment variables and secrets are not supported in Prometheus configuration files.
> See this Prometheus issue for more information:
>
>[Support for environment variable substitution in configuration file](https://github.com/prometheus/prometheus/issues/2357)
## How Prometheus metrics are parsed in InfluxDB
As Prometheus data is brought into InfluxDB, the following transformations are
made to match the InfluxDB data structure:
* The Prometheus metric name becomes the InfluxDB [measurement](/influxdb/v1.8/concepts/key_concepts/#measurement) name.
* The Prometheus sample (value) becomes an InfluxDB field using the `value` field key. It is always a float.
* Prometheus labels become InfluxDB tags.
* All `# HELP` and `# TYPE` lines are ignored.
* [v1.8.6 and later] Prometheus remote write endpoint drops unsupported Prometheus values (`NaN`, `-Inf`, and `+Inf`) rather than rejecting the entire batch.
* If [write trace logging is enabled (`[http] write-tracing = true`)](/influxdb/v1.8/administration/config/#write-tracing-false), then summaries of dropped values are logged.
* If a batch of values contains values that are subsequently dropped, HTTP status code `204` is returned.
### Example: Parse Prometheus to InfluxDB
```shell
# Prometheus metric
example_metric{queue="0:http://example:8086/api/v1/prom/write?db=prometheus",le="0.005"} 308
# Same metric parsed into InfluxDB
measurement
example_metric
tags
queue = "0:http://example:8086/api/v1/prom/write?db=prometheus"
le = "0.005"
job = "prometheus"
instance = "localhost:9090"
__name__ = "example_metric"
fields
value = 308
```
> In InfluxDB v1.5 and earlier, all Prometheus data goes into a single measurement
> named `_` and the Prometheus measurement name is stored in the `__name__` label.
> In InfluxDB v1.6 or later, every Prometheus measurement gets its own InfluxDB measurement.
{{% warn %}}
This format is different than the format used by the [Telegraf Prometheus input plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus).
{{% /warn %}}

View File

@ -0,0 +1,141 @@
---
title: UDP protocol support in InfluxDB
aliases:
- /influxdb/v1.8/tools/udp/
- /influxdb/v1.8/write_protocols/udp/
menu:
influxdb_1_8:
name: UDP
weight: 50
parent: Supported protocols
---
# The UDP Input
## A note on UDP/IP OS buffer sizes
Some operating systems (most notably, Linux) place very restrictive limits on the performance of UDP protocols.
It is _highly_ recommended that you increase these OS limits to at least 25MB before trying to run UDP traffic to your instance.
25MB is just a recommendation, and should be adjusted to be in line with your
`read-buffer` plugin setting.
### Linux
Check the current UDP/IP receive buffer default and limit by typing the following commands:
```
sysctl net.core.rmem_max
sysctl net.core.rmem_default
```
If the values are less than 26214400 bytes (25MB), you should add the following lines to the `/etc/sysctl.conf` file:
```
net.core.rmem_max=26214400
net.core.rmem_default=26214400
```
Changes to `/etc/sysctl.conf` do not take effect until reboot. To update the values immediately, type the following commands as root:
```
sysctl -w net.core.rmem_max=26214400
sysctl -w net.core.rmem_default=26214400
```
### BSD/Darwin
On BSD/Darwin systems, you need to add about 15% padding to the kernel socket buffer limit.
For example, if you want a 25MB buffer (26214400 bytes) you need to set the kernel limit to `26214400*1.15 = 30146560`.
This is not documented anywhere but happens
[in the kernel here](https://github.com/freebsd/freebsd/blob/master/sys/kern/uipc_sockbuf.c#L63-L64).
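As a quick sketch, the padded value can be computed with shell integer arithmetic:

```bash
# 26214400 bytes (25MB) plus 15% padding
echo $(( 26214400 * 115 / 100 ))   # prints 30146560
```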
#### Checking current UDP/IP buffer limits
To check the current UDP/IP buffer limit, type the following command:
```
sysctl kern.ipc.maxsockbuf
```
If the value is less than 30146560 bytes, you should add the following lines to the `/etc/sysctl.conf` file (create it if necessary):
```
kern.ipc.maxsockbuf=30146560
```
Changes to `/etc/sysctl.conf` do not take effect until reboot.
To update the values immediately, type the following command as root:
```
sysctl -w kern.ipc.maxsockbuf=30146560
```
### Using the `read-buffer` option for the UDP listener
The `read-buffer` option allows users to set the buffer size for the UDP listener.
It sets the size of the operating system's receive buffer associated with
the UDP traffic.
Keep in mind that the OS must be able to handle the number set here or the UDP listener will error and exit.
Setting `read-buffer = 0` results in the OS default being used and is usually too small for high UDP performance.
## Configuration
Each UDP input allows the binding address, target database, and target retention policy to be set. If the database does not exist, it will be created automatically when the input is initialized. If the retention policy is not configured, then the default retention policy for the database is used. However, if the retention policy is set, the retention policy must be explicitly created. The input will not automatically create it.
Each UDP input also performs internal batching of the points it receives, as batched writes to the database are more efficient. The default _batch size_ is 1000, _pending batch_ factor is 5, with a _batch timeout_ of 1 second. This means the input will write batches of maximum size 1000, but if a batch has not reached 1000 points within 1 second of the first point being added to a batch, it will emit that batch regardless of size. The pending batch factor controls how many batches can be in memory at once, allowing the input to transmit a batch, while still building other batches.
## Processing
The UDP input can receive up to 64KB per read, and splits the received data by newline. Each part is then interpreted as line-protocol encoded points, and parsed accordingly.
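As a quick smoke test (assuming a UDP listener bound to `:8089` on localhost), Bash's built-in `/dev/udp` pseudo-device can send newline-delimited line protocol:

```bash
# Each newline-separated line is parsed as one line-protocol point
printf 'cpu,host=server01 value=0.64\nmem,host=server01 used=4821i\n' \
  > /dev/udp/localhost/8089
```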
## UDP is connectionless
Since UDP is a connectionless protocol, there is no way to signal to the data source if any error occurs, and if data has even been successfully indexed. This should be kept in mind when deciding if and when to use the UDP input. The built-in UDP statistics are useful for monitoring the UDP inputs.
## Config examples
**One UDP listener**
```
# influxd.conf
...
[[udp]]
enabled = true
bind-address = ":8089" # the bind address
database = "telegraf" # Name of the database that will be written to
batch-size = 5000 # will flush if this many points get buffered
batch-timeout = "1s" # will flush at least this often even if the batch-size is not reached
batch-pending = 10 # number of batches that may be pending in memory
read-buffer = 0 # UDP read buffer, 0 means to use OS default
...
```
**Multiple UDP listeners**
```
# influxd.conf
...
[[udp]]
# Default UDP for Telegraf
enabled = true
bind-address = ":8089" # the bind address
database = "telegraf" # Name of the database that will be written to
batch-size = 5000 # will flush if this many points get buffered
batch-timeout = "1s" # will flush at least this often even if the batch-size is not reached
batch-pending = 10 # number of batches that may be pending in memory
read-buffer = 0 # UDP read buffer size, 0 means to use OS default
[[udp]]
# High-traffic UDP
enabled = true
bind-address = ":8189" # the bind address
database = "mymetrics" # Name of the database that will be written to
batch-size = 5000 # will flush if this many points get buffered
batch-timeout = "1s" # will flush at least this often even if the batch-size is not reached
batch-pending = 100 # number of batches that may be pending in memory
read-buffer = 8388608 # (8*1024*1024) UDP read buffer size
...
```
Content from [README](https://github.com/influxdata/influxdb/tree/1.8/services/udp/README.md)

View File

@ -0,0 +1,59 @@
---
title: InfluxDB tools
aliases:
- /influxdb/v1.8/clients/
- /influxdb/v1.8/write_protocols/json/
menu:
influxdb_1_8:
name: Tools
weight: 60
---
This section covers the available tools for interacting with InfluxDB.
## [`influx` command line interface (CLI)](/influxdb/v1.8/tools/influx-cli/)
The InfluxDB command line interface (`influx`) includes commands to manage many aspects of InfluxDB, including databases, organizations, users, and tasks.
## [`influxd` command](/influxdb/v1.8/tools/influxd)
The `influxd` command starts and runs all the processes necessary for InfluxDB to function.
## [InfluxDB API client libraries](/influxdb/v1.8/tools/api_client_libraries/)
The list of client libraries for interacting with the InfluxDB API.
## [Influx Inspect disk shard utility](/influxdb/v1.8/tools/influx_inspect/)
Influx Inspect is a tool designed to view detailed information about on disk shards, as well as export data from a shard to line protocol that can be inserted back into the database.
## [InfluxDB inch tool](/influxdb/v1.8/tools/inch/)
Use the InfluxDB inch tool to test InfluxDB performance. Adjust metrics such as the batch size, tag values, and concurrent write streams to test how ingesting different tag cardinalities and metrics affects performance.
## Graphs and dashboards
Use [Chronograf](/chronograf/latest/) or [Grafana](https://grafana.com/docs/grafana/latest/features/datasources/influxdb/) dashboards to visualize your time series data.
> **Tip:** Use template variables in your dashboards to filter meta query results by a specified period of time (see example below).
### Filter meta query results using template variables
The example below shows how to filter hosts retrieving data in the past hour.
##### Example
```sh
# Create a retention policy.
CREATE RETENTION POLICY "lookup" ON "prod" DURATION 1d REPLICATION 1
# Create a continuous query that groups by the tags you want to use in your template variables.
CREATE CONTINUOUS QUERY "lookupquery" ON "prod" BEGIN SELECT mean(value) as value INTO "your.system"."host_info" FROM "cpuload"
WHERE time > now() - 1h GROUP BY time(1h), host, team, status, location END;
# In your Grafana or Chronograf templates, include your tag values.
SHOW TAG VALUES FROM "your.system"."host_info" WITH KEY = "host"
```
> **Note:** In Chronograf, you can also filter meta query results for a specified time range by [creating a `custom meta query` template variable](/chronograf/latest/guides/dashboard-template-variables/#create-custom-template-variables) and adding a time range filter.

File diff suppressed because it is too large

View File

@ -0,0 +1,107 @@
---
title: InfluxDB client libraries
description: InfluxDB client libraries includes support for Arduino, C#, Go, Java, JavaScript, PHP, Python, and Ruby.
aliases:
- /influxdb/v1.8/clients/api_client_libraries/
- /influxdb/v1.8/clients/
- /influxdb/v1.8/clients/api
menu:
influxdb_1_8:
weight: 30
parent: Tools
---
InfluxDB client libraries are language-specific packages that integrate with the InfluxDB 2.0 API and support both **InfluxDB 1.8+** and **InfluxDB 2.0**.
>**Note:** We recommend using the new client libraries on this page to leverage the new read (via Flux) and write APIs and prepare for conversion to InfluxDB 2.0 and InfluxDB Cloud 2.0. For more information, see [InfluxDB 2.0 API compatibility endpoints](/influxdb/v1.8/tools/api/#influxdb-2.0-compatibility-endpoints). Client libraries for [InfluxDB 1.7 and earlier](/influxdb/v1.7/tools/api_client_libraries/) may continue to work, but are not maintained by InfluxData.
## Client libraries
Functionality varies between client libraries. Refer to client libraries on GitHub for specifics regarding each client library.
### Arduino
- [InfluxDB Arduino Client](https://github.com/tobiasschuerg/InfluxDB-Client-for-Arduino)
- Contributed by [Tobias Schürg (tobiasschuerg)](https://github.com/tobiasschuerg)
### C\#
- [influxdb-client-csharp](https://github.com/influxdata/influxdb-client-csharp)
- Maintained by [InfluxData](https://github.com/influxdata)
### Go
- [influxdb-client-go](https://github.com/influxdata/influxdb-client-go)
- Maintained by [InfluxData](https://github.com/influxdata)
### Java
- [influxdb-client-java](https://github.com/influxdata/influxdb-client-java)
- Maintained by [InfluxData](https://github.com/influxdata)
### JavaScript
* [influxdb-javascript](https://github.com/influxdata/influxdb-client-js)
- Maintained by [InfluxData](https://github.com/influxdata)
### PHP
- [influxdb-client-php](https://github.com/influxdata/influxdb-client-php)
- Maintained by [InfluxData](https://github.com/influxdata)
### Python
* [influxdb-client-python](https://github.com/influxdata/influxdb-client-python)
- Maintained by [InfluxData](https://github.com/influxdata)
### Ruby
- [influxdb-client-ruby](https://github.com/influxdata/influxdb-client-ruby)
- Maintained by [InfluxData](https://github.com/influxdata)
## Install and use a client library
To install and use the Python client library, follow the [instructions below](#install-and-use-the-python-client-library). To install and use other client libraries, refer to the client library documentation for details.
### Install and use the Python client library
1. Install the Python client library.
```sh
pip install influxdb-client
```
2. Ensure that InfluxDB is running. If running InfluxDB locally, visit http://localhost:8086. (If using InfluxDB Cloud, visit the URL of your InfluxDB Cloud UI.)
3. In your program, import the client library and use it to write data to InfluxDB. For example:
```python
import influxdb_client
from influxdb_client.client.write_api import SYNCHRONOUS
```
4. Define your database and token variables, and create a client object. The `InfluxDBClient` object takes two parameters: `url` and `token`.
```python
database = "<my-db>"
retention_policy = "autogen"  # or your target retention policy
token = "<my-token>"  # for InfluxDB 1.8, this is typically "username:password"
client = influxdb_client.InfluxDBClient(
    url="http://localhost:8086",
    token=token
)
```
> **Note:** The database (and retention policy, if applicable) are converted to a [bucket](https://v2.docs.influxdata.com/v2.0/reference/glossary/#bucket) data store compatible with InfluxDB 2.0.
5. Instantiate a writer object using the client object and the write_api method. Use the `write_api` method to configure the writer object.
```python
write_api = client.write_api(write_options=SYNCHRONOUS)
```
6. Create a point object and write it to InfluxDB using the write method of the API writer object. The write method takes the target bucket (the database and retention policy joined as `database/retention_policy`) and the record.
```python
p = influxdb_client.Point("my_measurement").tag("location", "Prague").field("temperature", 25.3)
write_api.write(bucket=f"{database}/{retention_policy}", record=p)
```

View File

@ -0,0 +1,11 @@
---
title: Grafana graphs and dashboards
menu:
influxdb_1_8:
url: "https://grafana.com/docs/grafana/latest/features/datasources/influxdb/"
weight: 60
parent: Tools
---
Please see [Grafana's InfluxDB documentation](https://grafana.com/docs/grafana/latest/features/datasources/influxdb/).

View File

@ -0,0 +1,75 @@
---
title: InfluxDB inch tool
description: Use the InfluxDB inch tool to test InfluxDB performance. Adjust the number of points and tag values to test ingesting different tag cardinalities.
menu:
influxdb_1_8:
weight: 50
parent: Tools
---
Use the InfluxDB inch tool to simulate streaming data to InfluxDB and measure your performance (for example, the impact of cardinality on write throughput). To do this, complete the following tasks:
- [Install InfluxDB inch](#install-influxdb-inch)
- [Use InfluxDB inch](#use-influxdb-inch)
## Install InfluxDB inch
1. To install `inch`, run the following command in your terminal:
```bash
$ go get github.com/influxdata/inch/cmd/inch
```
2. Verify `inch` is successfully installed in your `GOPATH/bin` (default on Unix `$HOME/go/bin`).
## Use InfluxDB inch
1. Log into the InfluxDB instance you want to test (for InfluxDB Enterprise, log into the data node(s) to test).
2. Run `inch`, specifying [`options`](#options) (metrics) to test (see [Options](#options) table below). For example, your syntax may look like this:
```bash
inch -v -c 8 -b 10000 -t 2,5000,1 -p 100000 -consistency any
```
This example starts generating a workload with:
- 8 concurrent (`-c`) write streams
- 10000 points per batch (`-b`)
- tag cardinality (`-t`) of 10000 unique series (2x5000x1)
- 100000 points (`-p`) per series
- any write `-consistency`
> **Note:** By default, `inch` writes generated test results to a database named `stress`. To change the name of the inch database, include the `-db string` option, for example, `inch -db test`.
3. To view the last 50 `inch` results, run the following query against the inch database:
```bash
> select * from stress limit 50
```
### Options
`inch` options listed in alphabetical order.
|Option | Description |Example |
|------------ | ---------- | ------- |
| `-b int` | batch size (default 5000; recommend between 5000-10000 points) | `-b 10000` |
| `-c int` | number of streams writing concurrently (default 1) | `-c 8` |
| `-consistency string` | write consistency (default "any"); values supported by the InfluxDB API include "all", "quorum", or "one". | `-consistency any` |
| `-db string` | name of the database to write to (default "stress") | `-db stress` |
| `-delay duration` | delay between writes (in seconds `s`, minutes `m`, or hours `h`) | `-delay 1s` |
| `-dry` | dry run (maximum write performance `perf` possible on the specified database) | `-dry` |
| `-f int` | total unique field key-value pairs per point (default 1) | `-f 1` |
|`-host string` | host (default "http<nolink>://localhost:8086") | `-host http://localhost:8086` |
| `-m int` | the number of measurements (default 1) | `-m 1` |
| `-max-errors int` | the number of InfluxDB errors that can occur before terminating the `inch` command | `-max-errors 5` |
| `-p int` | points per series (default 100) | `-p 100` |
| `-report-host string` | host to send metrics to | `-report-host http://localhost:8086` |
| `-report-tags string` | comma-separated key-value (`k=v`) tags to report alongside metrics | `-report-tags cpu=cpu1` |
| `-shard-duration string` | shard duration (default 7d) |`-shard-duration 7d` |
| `-t [string]`&ast;&ast; | comma-separated integers that represent tags. | `-t [100,20,4]` |
| `-target-latency duration` | if specified, attempt to adapt write delay to meet target. | |
| `-time duration` | time span to spread writes over. | `-time 1h` |
| `-v` | verbose; prints out details as you're running the test. | `-v` |
&ast;&ast; `-t [string]` each integer represents a **tag key** and the number of **tag values** to generate for the key (default [10,10,10]). Multiply each integer to calculate the tag cardinality. For example, `-t [100,20,4]` has a tag cardinality of 8000 unique series.

View File

@ -0,0 +1,43 @@
---
title: influx - InfluxDB command line interface
menu:
influxdb_1_8:
name: influx
weight: 10
parent: Tools
---
The `influx` command line interface (CLI) includes commands to manage many aspects of InfluxDB, including databases, organizations, users, and tasks.
## Usage
```
influx [flags]
```
## Flags
| Flag | Description |
|-------------------|-------------------------------------------------------------------------------------------------------|
| `-version` | Display the version and exit |
| `-url-prefix` | Path to add to the URL after the host and port. Specifies a custom endpoint to connect to. |
| `-host` | HTTP address of InfluxDB (default: `http://localhost:8086`) |
| `-port` | Port to connect to |
| `-socket` | Unix socket to connect to |
| `-database` | Database to connect to the server |
| `-password` | Password to connect to the server. Leaving blank will prompt for password (`--password ''`). |
| `-username` | Username to connect to the server |
| `-ssl` | Use https for requests |
| `-unsafessl` | Set this when connecting to the cluster using https |
| `-execute` | Execute command and quit |
| `-type` | Specify the query language for executing commands or when invoking the REPL. |
| `-format` | Specify the format of the server responses: json, csv, or column |
| `-precision` | Specify the format of the timestamp: rfc3339, h, m, s, ms, u or ns |
| `-consistency` | Set write consistency level: any, one, quorum, or all |
| `-pretty` | Turns on pretty print for JSON format |
| `-import` | Import a previous database export from file |
| `-pps` | Points per second the import will allow. The default is `0` and will not throttle importing. |
| `-path` | Path to file to import |
| `-compressed` | Set to true if the import file is compressed |
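For example, a one-off query that combines several of these flags (database and query are placeholders):

```bash
influx -host localhost -port 8086 -precision rfc3339 \
  -database NOAA_water_database \
  -execute 'SELECT COUNT("water_level") FROM h2o_feet'
```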

View File

@ -0,0 +1,578 @@
---
title: Influx Inspect disk utility
description: Use the "influx_inspect" commands to manage InfluxDB disks and shards.
menu:
influxdb_1_8:
weight: 50
parent: Tools
---
Influx Inspect is an InfluxDB disk utility that can be used to:
* View detailed information about disk shards.
* Export data from a shard to [InfluxDB line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol) that can be inserted back into the database.
* Convert TSM index shards to TSI index shards.
## `influx_inspect` utility
### Syntax
```
influx_inspect [ [ command ] [ options ] ]
```
`-help` is the default command and prints syntax and usage information for the tool.
### `influx_inspect` commands
The `influx_inspect` commands are summarized here, with links to detailed information on each of the commands.
* [`buildtsi`](#buildtsi): Converts in-memory (TSM-based) shards to TSI.
* [`deletetsm`](#deletetsm): Bulk deletes a measurement from a raw TSM file.
* [`dumptsi`](#dumptsi): Dumps low-level details about TSI files.
* [`dumptsm`](#dumptsm): Dumps low-level details about TSM files.
* [`dumptsmwal`](#dumptsmwal): Dump all data from a WAL file.
* [`export`](#export): Exports raw data from a shard in InfluxDB line protocol format.
* [`report`](#report): Displays a shard level report.
* [`reporttsi`](#reporttsi): Reports on cardinality for measurements and shards.
* [`verify`](#verify): Verifies the integrity of TSM files.
* [`verify-seriesfile`](#verify-seriesfile): Verifies the integrity of series files.
* [`verify-tombstone`](#verify-tombstone): Verifies the integrity of tombstones.
### `buildtsi`
Builds TSI (Time Series Index) disk-based shard index files and associated series files.
The index is written to a temporary location until complete and then moved to a permanent location.
If an error occurs, then this operation will fall back to the original in-memory index.
> ***Note:*** **For offline conversion only.**
> When TSI is enabled, new shards use the TSI indexes.
> Existing shards continue as TSM-based shards until
> converted offline.
#### Syntax
```
influx_inspect buildtsi -datadir <data_dir> -waldir <wal_dir> [ options ]
```
> **Note:** Use the `buildtsi` command with the user account that you are going to run the database as,
> or ensure that the permissions match after running the command.
#### Options
Optional arguments are in brackets.
##### `[ -batch-size ]`
The size of the batches written to the index. Default value is `10000`.
{{% warn %}}**Warning:** Setting this value can have adverse effects on performance and heap size.{{% /warn %}}
##### `[ -compact-series-file ]`
**Does not rebuild the index.** Compacts the existing series file, including offline series. Iterates series in each segment and rewrites non-tombstoned series in the index to a new .tmp file next to the segment. Once all segments are converted, the temporary files overwrite the original segments.
##### `[ -concurrency ]`
The number of workers to dedicate to shard index building.
Defaults to [`GOMAXPROCS`](/influxdb/v1.8/administration/config#gomaxprocs-environment-variable) value.
##### `[ -database <db_name> ]`
The name of the database.
##### `-datadir <data_dir>`
The path to the `data` directory.
##### `[ -max-cache-size ]`
The maximum size of the cache before it starts rejecting writes.
This value overrides the configuration setting for
`[data] cache-max-memory-size`.
Default value is `1073741824`.
##### `[ -max-log-file-size ]`
The maximum size of the log file. Default value is `1048576`.
##### `[ -retention <rp_name> ]`
The name of the retention policy.
##### `[ -shard <shard_ID> ]`
The identifier of the shard.
##### `[ -v ]`
Flag to enable output in verbose mode.
##### `-waldir <wal_dir>`
The directory for the WAL (Write Ahead Log) files.
#### Examples
##### Converting all shards on a node
```
$ influx_inspect buildtsi -datadir ~/.influxdb/data -waldir ~/.influxdb/wal
```
##### Converting all shards for a database
```
$ influx_inspect buildtsi -database mydb -datadir ~/.influxdb/data -waldir ~/.influxdb/wal
```
##### Converting a specific shard
```
$ influx_inspect buildtsi -database stress -shard 1 -datadir ~/.influxdb/data -waldir ~/.influxdb/wal
```
### `deletetsm`
Use `deletetsm -measurement` to delete a measurement in a raw TSM file (from specified shards).
Use `deletetsm -sanitize` to remove all tag and field keys containing non-printable Unicode characters in a raw TSM file (from specified shards).
{{% warn %}} **Warning:** Use the `deletetsm` command only when your InfluxDB instance is offline (`influxd` service is not running).{{% /warn %}}
#### Syntax
```
influx_inspect deletetsm -measurement <measurement_name> [ arguments ] <path>
```
##### `<path>`
Path to the `.tsm` file, located by default in the `data` directory.
When specifying the path, wildcards (`*`) can replace one or more characters.
#### Options
Either the `-measurement` or `-sanitize` flag is required.
##### `-measurement`
The name of the measurement to delete from TSM files.
##### `-sanitize`
Flag to remove all keys containing non-printable Unicode characters from TSM files.
##### `-v`
Optional. Flag to enable verbose logging.
#### Examples
##### Delete a measurement from a single shard
Delete the measurement `h2o_feet` from a single shard.
```
./influx_inspect deletetsm -measurement h2o_feet /influxdb/data/location/autogen/1384/*.tsm
```
##### Delete a measurement from all shards in the database
Delete the measurement `h2o_feet` from all shards in the database.
```
./influx_inspect deletetsm -measurement h2o_feet /influxdb/data/location/autogen/*/*.tsm
```
### `dumptsi`
Dumps low-level details about TSI files, including `.tsl` log files and `.tsi` index files.
#### Syntax
```
influx_inspect dumptsi [ options ] <index_path>
```
If no options are specified, summary statistics are provided for each file.
#### Options
Optional arguments are in brackets.
##### `-series-file <series_path>`
Path to the `_series` directory under the database `data` directory. Required.
##### [ `-series` ]
Dump raw series data.
##### [ `-measurements` ]
Dump raw [measurement](/influxdb/v1.8/concepts/glossary/#measurement) data.
##### [ `-tag-keys` ]
Dump raw [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key).
##### [ `-tag-values` ]
Dump raw [tag values](/influxdb/v1.8/concepts/glossary/#tag-value).
##### [ `-tag-value-series` ]
Dump raw series for each tag value.
##### [ `-measurement-filter <regular_expression>` ]
Filter data by measurement regular expression.
##### [ `-tag-key-filter <regular_expression>` ]
Filter data by tag key regular expression.
##### [ `-tag-value-filter <regular_expresssion>` ]
Filter data by tag value regular expression.
#### Examples
##### Specifying paths to the `_series` and `index` directories
```
$ influx_inspect dumptsi -series-file /path/to/db/_series /path/to/index
```
##### Specifying paths to the `_series` directory and an `index` file
```
$ influx_inspect dumptsi -series-file /path/to/db/_series /path/to/index/file0
```
##### Specifying paths to the `_series` directory and multiple `index` files
```
$ influx_inspect dumptsi -series-file /path/to/db/_series /path/to/index/file0 /path/to/index/file1 ...
```
### `dumptsm`
Dumps low-level details about [TSM](/influxdb/v1.8/concepts/glossary/#tsm-time-structured-merge-tree) files, including TSM (`.tsm`) files and WAL (`.wal`) files.
#### Syntax
```
influx_inspect dumptsm [ options ] <path>
```
##### `<path>`
Path to the `.tsm` file, located by default in the `data` directory.
#### Options
Optional arguments are in brackets.
##### [ `-index` ]
Flag to dump raw index data.
Default value is `false`.
##### [ `-blocks` ]
Flag to dump raw block data.
Default value is `false`.
##### [ `-all` ]
Flag to dump all data. Caution: This may print a lot of information.
Default value is `false`.
##### [ `-filter-key <key_name>` ]
Display only index data and block data that match this key substring.
Default value is `""`.
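For example (hypothetical shard path), dump only the index and block data whose keys contain `cpu`:

```
$ influx_inspect dumptsm -index -filter-key cpu ~/.influxdb/data/mydb/autogen/1/000000001-000000001.tsm
```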
### `dumptsmwal`
Dumps all entries from one or more WAL (`.wal`) files only and excludes TSM (`.tsm`) files.
#### Syntax
```
influx_inspect dumptsmwal [ options ] <wal_dir>
```
#### Options
Optional arguments are in brackets.
##### [ `-show-duplicates` ]
Flag to show keys which have duplicate or out-of-order timestamps.
If a user writes points with timestamps set by the client, then multiple points with the same timestamp (or with time-descending timestamps) can be written.
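For example (hypothetical WAL directory for shard `1` of database `mydb`):

```
$ influx_inspect dumptsmwal -show-duplicates ~/.influxdb/wal/mydb/autogen/1
```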
### `export`
Exports all TSM files in InfluxDB line protocol data format.
Writes all WAL file data for `_internal/monitor`.
This output file can be imported using the
[influx](/influxdb/v1.8/tools/shell/#import-data-from-a-file-with-import) command.
#### Syntax
```
influx_inspect export [ options ]
```
#### Options
Optional arguments are in brackets.
##### [ `-compress` ]
The flag to compress the output using gzip compression.
Default value is `false`.
##### [ `-database <db_name>` ]
The name of the database to export.
Default value is `""`.
##### `-datadir <data_dir>`
The path to the `data` directory.
Default value is `"$HOME/.influxdb/data"`.
##### [ `-end <timestamp>` ]
The timestamp for the end of the time range. Must be in [RFC3339 format](https://tools.ietf.org/html/rfc3339).
RFC3339 requires very specific formatting. For example, to indicate no time zone offset (UTC+0), you must include Z or +00:00 after seconds. Examples of valid RFC3339 formats include:
**No offset**
```
YYYY-MM-DDTHH:MM:SS+00:00
YYYY-MM-DDTHH:MM:SSZ
YYYY-MM-DDTHH:MM:SS.nnnnnnZ (fractional seconds (.nnnnnn) are optional)
```
**With offset**
```
YYYY-MM-DDTHH:MM:SS-08:00
YYYY-MM-DDTHH:MM:SS+07:00
```
> **Note:** With offsets, avoid replacing the + or - sign with a Z. It may cause an error or print Z (ISO 8601 behavior) instead of the time zone offset.
##### [ `-out <export_dir>` ]
The location for the export file.
Default value is `"$HOME/.influxdb/export"`.
##### [ `-retention <rp_name> ` ]
The name of the [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) to export. Default value is `""`.
##### [ `-start <timestamp>` ]
The timestamp for the start of the time range.
The timestamp string must be in [RFC3339 format](https://tools.ietf.org/html/rfc3339).
##### [ `-waldir <wal_dir>` ]
Path to the [WAL](/influxdb/v1.8/concepts/glossary/#wal-write-ahead-log) directory.
Default value is `"$HOME/.influxdb/wal"`.
#### Examples
##### Export all databases and compress the output
```bash
influx_inspect export -compress
```
##### Export data from a specific database and retention policy
```bash
influx_inspect export -database mydb -retention autogen
```
##### Output file
```bash
# DDL
CREATE DATABASE MY_DB_NAME
CREATE RETENTION POLICY autogen ON MY_DB_NAME DURATION inf REPLICATION 1
# DML
# CONTEXT-DATABASE:MY_DB_NAME
# CONTEXT-RETENTION-POLICY:autogen
randset value=97.9296104805 1439856000000000000
randset value=25.3849066842 1439856100000000000
```
### `report`
Displays series metadata for all shards.
The default location is `$HOME/.influxdb`.
#### Syntax
```
influx_inspect report [ options ]
```
#### Options
Optional arguments are in brackets.
##### [ `-pattern "<regular expression/wildcard>"` ]
The regular expression or wildcard pattern to match included files.
Default value is `""`.
##### [ `-detailed` ]
The flag to report detailed cardinality estimates.
Default value is `false`.
##### [ `-exact` ]
The flag to report exact cardinality counts instead of estimates.
Default value is `false`.
Note: This can use a lot of memory.
### `reporttsi`
The report does the following:
* Calculates the total exact series cardinality in the database.
* Segments that cardinality by measurement, and emits those cardinality values.
* Emits total exact cardinality for each shard in the database.
* Segments for each shard the exact cardinality for each measurement in the shard.
* Optionally limits the results in each shard to the "top n".
The `reporttsi` command is primarily useful when there has been a change in cardinality
and it's not clear which measurement is responsible for this change, and further, _when_
that change happened. Estimating an accurate cardinality breakdown for each measurement
and for each shard will help answer those questions.
#### Syntax
```
influx_inspect reporttsi -db-path <path-to-db> [ options ]
```
#### Options
Optional arguments are in brackets.
##### `-db-path <path-to-db>`
The path to the database.
##### [ `-top <n>` ]
Limits the results to the top specified number within each shard.
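For example, report the top 5 measurements by cardinality in each shard (the database path is a placeholder):

```
$ influx_inspect reporttsi -db-path ~/.influxdb/data/mydb -top 5
```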
#### Performance
The `reporttsi` command uses simple slice/maps to store low cardinality measurements, which saves on the cost of initializing bitmaps.
For high cardinality measurements the tool uses [roaring bitmaps](https://roaringbitmap.org/), which means we don't need to store all series IDs on the heap while running the tool.
Conversion from low-cardinality to high-cardinality representations is done automatically while the tool runs.
### `verify`
Verifies the integrity of TSM files.
#### Syntax
```
influx_inspect verify [ options ]
```
#### Options
Optional arguments are in brackets.
##### `-dir <storage_root>`
The path to the storage root directory.
Default value is `"/root/.influxdb"`.
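For example, verify all TSM files under the default storage root:

```
$ influx_inspect verify -dir ~/.influxdb
```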
### `verify-seriesfile`
Verifies the integrity of series files.
#### Syntax
```
influx_inspect verify-seriesfile [ options ]
```
#### Options
Optional arguments are in brackets.
##### [ `-c <number>` ]
Specifies the number of concurrent workers to run for this command. Default is equal to the value of GOMAXPROCS. If performance is adversely impacted, you can set a lower value.
##### [ `-dir <path>` ]
Specifies the root data path. Defaults to `~/.influxdb/data`.
##### [ `-db <db_name>` ]
Restricts verifying series files to the specified database in the data directory.
##### [ `-series-file <path>` ]
Path to a specific series file; overrides `-db` and `-dir`.
##### [ `-v` ]
Enables verbose logging.
### `verify-tombstone`
Verifies the integrity of tombstones.
#### Syntax
```
influx_inspect verify-tombstone [ options ]
```
Finds and verifies all tombstones under the specified directory path (by default, `~/.influxdb/data`). Files are verified serially.
#### Options
Optional arguments are in brackets.
##### [ `-dir <path>` ]
Specifies the root data path. Defaults to `~/.influxdb/data`. This path can be arbitrary, for example, it doesn't need to be an InfluxDB data directory.
##### [ `-v` ]
Enables verbose logging. Confirms a file is being verified and displays progress every 5 million tombstone entries.
##### [ `-vv` ]
Enables very verbose logging. Displays progress for every series key and time range in the tombstone files. Timestamps are displayed in nanoseconds since the Epoch (`1970-01-01T00:00:00Z`).
##### [ `-vvv` ]
Enables very very verbose logging. Displays progress for every series key and time range in the tombstone files. Timestamps are displayed in [RFC3339 format](https://tools.ietf.org/html/rfc3339) with nanosecond precision.
> **Note on verbose logging:** Higher verbosity levels override lower levels.
## Caveats
The system does not have access to the metastore when exporting TSM shards.
As such, it always creates the [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) with infinite duration and replication factor of 1. End users may want to change this prior to reimporting if they are importing to a cluster or want a different duration
for retention.
View File
@ -0,0 +1,27 @@
---
title: influxd - InfluxDB daemon
description: The influxd daemon starts and runs all the processes necessary for InfluxDB to function.
menu:
influxdb_1_8:
name: influxd
weight: 10
parent: Tools
---
The `influxd` command starts and runs all the processes necessary for InfluxDB to function.
## Usage
```
influxd [[command] [arguments]]
```
## Commands
| Command | Description |
|-------------------------------------------------------|----------------------------------------------------------|
| [backup](/influxdb/latest/tools/influxd/backup)       | Download a snapshot of a data node and save it to disk.   |
| [config](/influxdb/latest/tools/influxd/config) | Display the default configuration. |
| help | Display the help message. |
| [restore](/influxdb/latest/tools/influxd/restore) | Use a snapshot of a data node to rebuild a cluster. |
| [run](/influxdb/latest/tools/influxd/run) | Run node with existing configuration. |
| [version](/influxdb/latest/tools/influxd/version) | Display the InfluxDB version. |
View File
@ -0,0 +1,31 @@
---
title: influxd backup
description: The `influxd backup` command creates a backup copy of specified InfluxDB OSS databases and saves the files in an Enterprise-compatible format.
menu:
influxdb_1_8:
name: influxd backup
weight: 10
parent: influxd
---
The `influxd backup` command creates a backup copy of specified InfluxDB OSS database(s) and saves the files in an Enterprise-compatible format to PATH (the directory where backups are saved).
## Usage
```
influxd backup [flags] PATH
```
## Flags
| Flag | Description |
|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `-portable`    | Generate backup files in a portable format that can be restored to InfluxDB OSS or InfluxDB Enterprise. Use unless the legacy backup format is required.         |
| `-host`        | InfluxDB OSS host to back up from. Optional. Defaults to `127.0.0.1:8088`.                                                                                       |
| `-db`          | InfluxDB OSS database name to back up. Optional. If not specified, all databases are backed up when using `-portable`.                                           |
| `-rp`          | Retention policy to use for the backup. Optional. If not specified, all retention policies are used by default.                                                  |
| `-shard`       | The identifier of the shard to back up. Optional. If specified, `-rp` is required.                                                                               |
| `-start`       | Include all points starting with the specified timestamp (RFC3339 format). Not compatible with `-since`.                                                         |
| `-end`         | Exclude all points after the specified timestamp (RFC3339 format). Not compatible with `-since`.                                                                 |
| `-since`       | Create an incremental backup of all points after the specified timestamp (RFC3339 format). Optional. Recommend using `-start` instead.                           |
| `-skip-errors` | Continue backing up the remaining shards when the current shard fails to back up.                                                                                |
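For example, the following commands show a full portable backup and a time-bounded backup of a single database (the database name and backup path are illustrative):

```bash
# Back up all databases in portable format to ./backups.
influxd backup -portable ./backups

# Back up only points in "telegraf" written at or after the given timestamp.
influxd backup -portable -db telegraf -start 2020-01-01T00:00:00Z ./backups
```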
View File
@ -0,0 +1,23 @@
---
title: influxd config
description: The `influxd config` command displays the default configuration.
menu:
influxdb_1_8:
name: influxd config
weight: 10
parent: influxd
---
The `influxd config` command displays the default configuration.
## Usage
```
influxd config [flags]
```
## Flags
| Flag | Description | Maps To |
|---------------|--------------------------------------------------------------------------------------------------------------------------------------------|------------------------|
| `-config` | Set the path to the configuration file. Disable the automatic loading of a configuration file using the null device (such as `/dev/null`). | `INFLUXDB_CONFIG_PATH` |
| `-h`, `-help` | Help for the `influxd config` command. | |
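For example, a common pattern is to save the default configuration as a starting point for customization (the output path is illustrative):

```bash
# Print the default configuration and save it to a file.
influxd config > /etc/influxdb/influxdb.generated.conf
```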
View File
@ -0,0 +1,30 @@
---
title: influxd restore
description: The `influxd restore` command restores backup data and metadata from an InfluxDB backup directory.
menu:
influxdb_1_8:
name: influxd restore
weight: 10
parent: influxd
---
The `influxd restore` command restores backup data and metadata from an InfluxDB backup directory.
Shut down the `influxd` server before restoring data.
## Usage
```
influxd restore [flags]
```
## Flags
| Flag        | Description                                                                                                                                                                                         |
|-------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `-portable` | Activate portable restore mode, which consumes files in an improved Enterprise-compatible format that includes a manifest. If not specified, the legacy restore mode is used.                        |
| `-host`     | InfluxDB OSS host to connect to where the data will be restored.                                                                                                                                     |
| `-db`       | Name of the database to be restored from the backup (InfluxDB OSS or InfluxDB Enterprise).                                                                                                           |
| `-newdb`    | Name of the InfluxDB OSS database to import archived data into. Optional. If not specified, the value of `-db <db_name>` is used. The new database name must be unique to the target system.         |
| `-rp`       | Name of the retention policy to restore. Optional. Requires that `-db` is specified.                                                                                                                 |
| `-newrp`    | Name of the retention policy to restore to. Optional. Requires that `-rp` is specified.                                                                                                              |
| `-shard`    | Shard ID to restore. Optional. Requires that `-db` and `-rp` are specified.                                                                                                                          |
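For example, to restore a portable backup into a new database name (names and the backup path are illustrative):

```bash
# Restore the "telegraf" database from ./backups into "telegraf_restored".
influxd restore -portable -db telegraf -newdb telegraf_restored ./backups
```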
View File
@ -0,0 +1,34 @@
---
title: influxd run
description: The `influxd run` command starts and runs all the processes necessary for InfluxDB to function.
menu:
influxdb_1_8:
name: influxd run
weight: 10
parent: influxd
---
The `influxd run` command is the default command for `influxd`.
It starts and runs all the processes necessary for InfluxDB to function.
## Usage
```
influxd run [flags]
```
Because `run` is the default command for `influxd`, the following commands are the same:
```bash
influxd
influxd run
```
## Flags
| Flag | Description |
|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `-config` | Path to the configuration file. This defaults to the environment variable `INFLUXDB_CONFIG_PATH`, `~/.influxdb/influxdb.conf`, or `/etc/influxdb/influxdb.conf` if a file is present at either of these locations. Disable the automatic loading of a configuration file using the null device (such as `/dev/null`). |
| `-pidfile` | Write process ID to a file. |
| `-cpuprofile` | Write CPU profiling information to a file. |
| `-memprofile` | Write memory usage information to a file. |
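For example (paths are illustrative):

```bash
# Start InfluxDB with an explicit configuration file and write the
# process ID to a file.
influxd run -config /etc/influxdb/influxdb.conf -pidfile /var/run/influxd.pid
```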
View File
@ -0,0 +1,24 @@
---
title: influxd version
description: The `influxd version` command outputs the current version of InfluxDB.
menu:
influxdb_1_8:
name: influxd version
weight: 10
parent: influxd
---
The `influxd version` command outputs the current version of InfluxDB.
## Usage
```
influxd version [flags]
```
## Flags
| Flag | Description |
|:---- |:----------- |
| `-h`, `-help` | Help for the `version` command |
View File
@ -0,0 +1,397 @@
---
title: Using influx - InfluxDB command line interface
menu:
influxdb_1_8:
name: Using influx
weight: 10
parent: Guides
---
InfluxDB's command line interface (`influx`) is an interactive shell for the HTTP API.
Use `influx` to write data (manually or from a file), query data interactively, and view query output in different formats.
* [Launch `influx`](/influxdb/v1.8/tools/shell/#launch-influx)
* [`influx` Arguments](/influxdb/v1.8/tools/shell/#influx-arguments)
* [`influx` Commands](/influxdb/v1.8/tools/shell/#influx-commands)
## Launch `influx`
If you [install](https://influxdata.com/downloads/) InfluxDB via a package manager, the CLI is installed at `/usr/bin/influx` (`/usr/local/bin/influx` on macOS).
To access the CLI, first launch the `influxd` database process and then launch `influx` in your terminal.
Once you've entered the shell and successfully connected to an InfluxDB node, you'll see the following output:
```bash
$ influx
Connected to http://localhost:8086 version 1.8.x
InfluxDB shell version: 1.8.x
```
> **Note:** The versions of InfluxDB and the CLI should be identical. If not, parsing issues can occur with queries.
You can now enter InfluxQL queries as well as some CLI-specific commands directly in your terminal.
You can use `help` at any time to get a list of available commands. Use `Ctrl+C` to cancel a long-running InfluxQL query.
## Environment Variables
The following environment variables can be used to configure settings used by the `influx` client. They can be specified in lowercase or uppercase; the uppercase version takes precedence.
#### `HTTP_PROXY`
Defines the proxy server to use for HTTP.
**Value format:** `[protocol://]<host>[:port]`
```
HTTP_PROXY=http://localhost:1234
```
#### `HTTPS_PROXY`
Defines the proxy server to use for HTTPS. Takes precedence over `HTTP_PROXY` for HTTPS requests.
**Value format:** `[protocol://]<host>[:port]`
```
HTTPS_PROXY=https://localhost:1443
```
#### `NO_PROXY`
List of host names that should **not** go through any proxy. If set to an asterisk '\*' only, it matches all hosts.
**Value format:** comma-separated list of hosts
```
NO_PROXY=123.45.67.89,123.45.67.90
```
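For example, a hypothetical session that routes CLI traffic through a proxy while connecting to a local instance directly might look like this (the proxy address is illustrative):

```bash
# Route HTTPS requests through a proxy, but bypass it for localhost.
export HTTPS_PROXY=https://proxy.example.com:1443
export NO_PROXY=localhost,127.0.0.1
influx -host localhost -port 8086
```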
## `influx` Arguments
There are several arguments you can pass into `influx` when starting.
List them with `$ influx --help`.
The list below offers a brief discussion of each option.
We provide detailed information on `-execute`, `-format`, and `-import` at the end of this section.
`-compressed`
Set to true if the import file is compressed.
Use with `-import`.
`-consistency 'any|one|quorum|all'`
Set the write consistency level.
`-database 'database name'`
The database to which `influx` connects.
`-execute 'command'`
Execute an [InfluxQL](/influxdb/v1.8/query_language/data_exploration/) command and quit.
See [-execute](/influxdb/v1.8/tools/shell/#execute-an-influxql-command-and-quit-with-execute).
`-format 'json|csv|column'`
Specifies the format of the server responses.
See [-format](/influxdb/v1.8/tools/shell/#specify-the-format-of-the-server-responses-with-format).
`-host 'host name'`
The host to which `influx` connects.
By default, InfluxDB runs on localhost.
`-import`
Import new data from a file or import a previously [exported](https://github.com/influxdb/influxdb/blob/1.8/importer/README.md) database from a file.
See [-import](/influxdb/v1.8/tools/shell/#import-data-from-a-file-with-import).
`-password 'password'`
The password `influx` uses to connect to the server.
`influx` will prompt for a password if you leave it blank (`-password ''`).
Alternatively, set the password for the CLI with the `INFLUX_PASSWORD` environment
variable.
`-path`
The path to the file to import.
Use with `-import`.
`-port 'port #'`
The port to which `influx` connects.
By default, InfluxDB runs on port `8086`.
`-pps`
How many points per second the import will allow.
By default, pps is zero and `influx` does not throttle importing.
Use with `-import`.
`-precision 'rfc3339|h|m|s|ms|u|ns'`
Specifies the format/precision of the timestamp: `rfc3339` (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`), `h` (hours), `m` (minutes), `s` (seconds), `ms` (milliseconds), `u` (microseconds), `ns` (nanoseconds).
Precision defaults to nanoseconds.
> **Note:** Setting the precision to `rfc3339` (`-precision rfc3339`) works with the `-execute` option, but it does not work with the `-import` option. All other precision formats (e.g., `h`, `m`, `s`, `ms`, `u`, and `ns`) work with the `-execute` and `-import` options.
`-pretty`
Turns on pretty print for the `json` format.
`-ssl`
Use HTTPS for requests.
`-unsafeSsl`
Disables SSL certificate verification.
Use when connecting over HTTPS with a self-signed certificate.
`-username 'username'`
The username that `influx` uses to connect to the server.
Alternatively, set the username for the CLI with the `INFLUX_USERNAME` environment variable.
`-version`
Display the InfluxDB version and exit.
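For example, the following launch combines several of these arguments (host, credentials, and database name are illustrative):

```bash
# Connect over HTTPS to a remote instance, target a specific database,
# and display timestamps in RFC3339 format.
influx -host influxdb.example.com -port 8086 -ssl \
  -username admin -password '' \
  -database telegraf -precision rfc3339
```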
### Execute an InfluxQL command and quit with `-execute`
Execute queries that don't require a database specification:
```bash
$ influx -execute 'SHOW DATABASES'
name: databases
---------------
name
NOAA_water_database
_internal
telegraf
pirates
```
Execute queries that do require a database specification, and change the timestamp precision:
```bash
$ influx -execute 'SELECT * FROM "h2o_feet" LIMIT 3' -database="NOAA_water_database" -precision=rfc3339
name: h2o_feet
--------------
time level description location water_level
2015-08-18T00:00:00Z below 3 feet santa_monica 2.064
2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12
2015-08-18T00:06:00Z between 6 and 9 feet coyote_creek 8.005
```
### Specify the format of the server responses with `-format`
The default format is `column`:
```bash
$ influx -format=column
[...]
> SHOW DATABASES
name: databases
---------------
name
NOAA_water_database
_internal
telegraf
pirates
```
Change the format to `csv`:
```bash
$ influx -format=csv
[...]
> SHOW DATABASES
name,name
databases,NOAA_water_database
databases,_internal
databases,telegraf
databases,pirates
```
Change the format to `json`:
```bash
$ influx -format=json
[...]
> SHOW DATABASES
{"results":[{"series":[{"name":"databases","columns":["name"],"values":[["NOAA_water_database"],["_internal"],["telegraf"],["pirates"]]}]}]}
```
Change the format to `json` and turn on pretty print:
```bash
$ influx -format=json -pretty
[...]
> SHOW DATABASES
{
"results": [
{
"series": [
{
"name": "databases",
"columns": [
"name"
],
"values": [
[
"NOAA_water_database"
],
[
"_internal"
],
[
"telegraf"
],
[
"pirates"
]
]
}
]
}
]
}
```
### Import data from a file with `-import`
The import file has two sections:
* **DDL (Data Definition Language)**: Contains the [InfluxQL commands](/influxdb/v1.8/query_language/database_management/) for creating the relevant [database](/influxdb/v1.8/concepts/glossary/) and managing the [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp).
If your database and retention policy already exist, your file can skip this section.
* **DML (Data Manipulation Language)**: Lists the relevant database and (if desired) retention policy and contains the data in [line protocol](/influxdb/v1.8/concepts/glossary/#line-protocol).
Example:
File (`datarrr.txt`):
```
# DDL
CREATE DATABASE pirates
CREATE RETENTION POLICY oneday ON pirates DURATION 1d REPLICATION 1
# DML
# CONTEXT-DATABASE: pirates
# CONTEXT-RETENTION-POLICY: oneday
treasures,captain_id=dread_pirate_roberts value=801 1439856000
treasures,captain_id=flint value=29 1439856000
treasures,captain_id=sparrow value=38 1439856000
treasures,captain_id=tetra value=47 1439856000
treasures,captain_id=crunch value=109 1439858880
```
Command:
```
$ influx -import -path=datarrr.txt -precision=s
```
Results:
```
2015/12/22 12:25:06 Processed 2 commands
2015/12/22 12:25:06 Processed 5 inserts
2015/12/22 12:25:06 Failed 0 inserts
```
> **Note:** For large datasets, `influx` writes out a status message every 100,000 points.
> For example:
>
> 2015/08/21 14:48:01 Processed 3100000 lines.
> Time elapsed: 56.740578415s.
> Points per second (PPS): 54634
Things to note about `-import`:
* Allow the database to ingest points by using `-pps` to set the number of points per second allowed by the import. By default, pps is zero and `influx` does not throttle importing.
* Imports work with `.gz` files, just include `-compressed` in the command.
* Include timestamps in the data file. InfluxDB will assign the same timestamp to points without a timestamp. This can lead to unintended [overwrite behavior](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points).
* If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
We recommend writing points in batches of 5,000 to 10,000 points.
Smaller batches, and more HTTP requests, will result in sub-optimal performance.
By default, the HTTP request times out after five seconds.
InfluxDB will still attempt to write the points after that time out but there will be no confirmation that they were successfully written.
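Below is a minimal sketch of one way to batch a large file, assuming `points.txt` contains only line protocol (no DDL or DML headers) and that the target database and retention policy already exist:

```bash
# Split the data into 5,000-point batches, prepend the DML context to
# each batch, then import the batches one at a time.
split -l 5000 points.txt batch_
for f in batch_*; do
  {
    echo "# DML"
    echo "# CONTEXT-DATABASE: pirates"
    echo "# CONTEXT-RETENTION-POLICY: oneday"
    cat "$f"
  } > "import_$f.txt"
  influx -import -path="import_$f.txt" -precision=s
done
```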
> **Note:** For how to export data from InfluxDB version 0.8.9, see [Exporting from 0.8.9](https://github.com/influxdb/influxdb/blob/1.8/importer/README.md).
## `influx` commands
Enter `help` in the CLI for a partial list of the available commands.
### Commands
The list below offers a brief discussion of each command.
We provide detailed information on `insert` at the end of this section.
`auth`
Prompts you for your username and password.
`influx` uses those credentials when querying a database.
Alternatively, set the username and password for the CLI with the
`INFLUX_USERNAME` and `INFLUX_PASSWORD` environment variables.
`chunked`
Turns on chunked responses from the server when issuing queries.
This setting is enabled by default.
`chunk size <size>`
Sets the size of the chunked responses.
The default size is `10,000`.
Setting it to `0` resets `chunk size` to its default value.
`clear [ database | db | retention policy | rp ]`
Clears the current context for the [database](/influxdb/v1.8/concepts/glossary/#database) or [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp).
`connect <host:port>`
Connect to a different server without exiting the shell.
By default, `influx` connects to `localhost:8086`.
If you do not specify either the host or the port, `influx` assumes the default setting for the missing attribute.
`consistency <level>`
Sets the write consistency level: `any`, `one`, `quorum`, or `all`.
`Ctrl+C`
Terminates the currently running query. Useful when an interactive query is taking too long to respond
because it is trying to return too much data.
`exit` `quit` `Ctrl+D`
Quits the `influx` shell.
`format <format>`
Specifies the format of the server responses: `json`, `csv`, or `column`.
See the description of [-format](/influxdb/v1.8/tools/shell/#specify-the-format-of-the-server-responses-with-format) for examples of each format.
`history`
Displays your command history.
To use the history while in the shell, simply use the "up" arrow.
`influx` stores your last 1,000 commands in your home directory in `.influx_history`.
`insert`
Write data using line protocol.
See [insert](/influxdb/v1.8/tools/shell/#write-data-to-influxdb-with-insert).
`precision <format>`
Specifies the format/precision of the timestamp: `rfc3339` (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`), `h` (hours), `m` (minutes), `s` (seconds), `ms` (milliseconds), `u` (microseconds), `ns` (nanoseconds).
Precision defaults to nanoseconds.
`pretty`
Turns on pretty print for the `json` format.
`settings`
Outputs the current settings for the shell including the `Host`, `Username`, `Database`, `Retention Policy`, `Pretty` status, `Chunked` status, `Chunk Size`, `Format`, and `Write Consistency`.
`use [ "<database_name>" | "<database_name>"."<retention policy_name>" ]`
Sets the current [database](/influxdb/v1.8/concepts/glossary/#database) and/or [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp).
Once `influx` sets the current database and/or retention policy, there is no need to specify that database and/or retention policy in queries.
If you do not specify the retention policy, `influx` automatically queries the `use`d database's `DEFAULT` retention policy.
#### Write data to InfluxDB with `insert`
Enter `insert` followed by the data in [line protocol](/influxdb/v1.8/concepts/glossary/#line-protocol) to write data to InfluxDB.
Use `insert into <retention policy> <line protocol>` to write data to a specific [retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp).
Write data to a single field in the measurement `treasures` with the tag `captain_id = pirate_king`.
`influx` automatically writes the point to the database's `DEFAULT` retention policy.
```
> INSERT treasures,captain_id=pirate_king value=2
>
```
Write the same point to the already-existing retention policy `oneday`:
```
> INSERT INTO oneday treasures,captain_id=pirate_king value=2
Using retention policy oneday
>
```
### Queries
Execute all InfluxQL queries in `influx`.
See [Data exploration](/influxdb/v1.8/query_language/data_exploration/), [Schema exploration](/influxdb/v1.8/query_language/schema_exploration/), [Database management](/influxdb/v1.8/query_language/database_management/), [Authentication and authorization](/influxdb/v1.8/administration/authentication_and_authorization/) for InfluxQL documentation.
View File
@ -0,0 +1,20 @@
---
title: Troubleshooting InfluxDB
menu:
influxdb_1_8:
name: Troubleshooting
weight: 110
---
## [Frequently asked questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/)
This page addresses frequent sources of confusion and places where InfluxDB behaves in an unexpected way relative to other database systems.
Where applicable, it links to outstanding issues on GitHub.
## [Query management](/influxdb/v1.8/troubleshooting/query_management/)
With InfluxDB's query management features, users can identify currently running queries and kill queries that are overloading their system. Additionally, users can prevent and halt the execution of inefficient queries with several configuration settings.
## [Error messages](/influxdb/v1.8/troubleshooting/errors/)
This page includes information about some of the more common InfluxDB error messages, their descriptions, and, where applicable, common resolutions.
View File
@ -0,0 +1,407 @@
---
title: InfluxDB error messages
description: Covers InfluxDB error messages, their descriptions, and common resolutions.
menu:
influxdb_1_8:
name: Error messages
weight: 30
parent: Troubleshooting
---
This page documents errors, their descriptions, and, where applicable,
common resolutions.
{{% warn %}}
**Disclaimer:** This document does not contain an exhaustive list of all possible InfluxDB errors.
{{% /warn %}}
## `error: database name required`
The `database name required` error occurs when certain `SHOW` queries do
not specify a [database](/influxdb/v1.8/concepts/glossary/#database).
Specify a database with an `ON` clause in the `SHOW` query, with `USE <database_name>` in the
[CLI](/influxdb/v1.8/tools/shell/), or with the `db` query string parameter in
the [InfluxDB API](/influxdb/v1.8/tools/api/#query-string-parameters) request.
The relevant `SHOW` queries include `SHOW RETENTION POLICIES`, `SHOW SERIES`,
`SHOW MEASUREMENTS`, `SHOW TAG KEYS`, `SHOW TAG VALUES`, and `SHOW FIELD KEYS`.
**Resources:**
[Schema exploration](/influxdb/v1.8/query_language/schema_exploration/),
[InfluxQL reference](/influxdb/v1.8/query_language/spec/)
## `error: max series per database exceeded: < >`
The `max series per database exceeded` error occurs when a write causes the
number of [series](/influxdb/v1.8/concepts/glossary/#series) in a database to
exceed the maximum allowable series per database.
The maximum allowable series per database is controlled by the
`max-series-per-database` setting in the `[data]` section of the configuration
file.
The information in the `< >` shows the measurement and the tag set of the series
that exceeded `max-series-per-database`.
By default `max-series-per-database` is set to one million.
Changing the setting to `0` allows an unlimited number of series per database.
**Resources:**
[Database Configuration](/influxdb/v1.8/administration/config/#max-series-per-database-1000000)
## `error parsing query: found < >, expected identifier at line < >, char < >`
### InfluxQL syntax
The `expected identifier` error occurs when InfluxDB anticipates an identifier
in a query but doesn't find it.
Identifiers are tokens that refer to continuous query names, database names,
field keys, measurement names, retention policy names, subscription names,
tag keys, and user names.
The error is often a gentle reminder to double-check your query's syntax.
**Examples**
*Query 1:*
```sql
> CREATE CONTINUOUS QUERY ON "telegraf" BEGIN SELECT mean("usage_idle") INTO "average_cpu" FROM "cpu" GROUP BY time(1h),"cpu" END
ERR: error parsing query: found ON, expected identifier at line 1, char 25
```
Query 1 is missing a continuous query name between `CREATE CONTINUOUS QUERY` and
`ON`.
*Query 2:*
```sql
> SELECT * FROM WHERE "blue" = true
ERR: error parsing query: found WHERE, expected identifier at line 1, char 15
```
Query 2 is missing a measurement name between `FROM` and `WHERE`.
### InfluxQL keywords
In some cases the `expected identifier` error occurs when one of the
[identifiers](/influxdb/v1.8/concepts/glossary/#identifier) in the query is an
[InfluxQL Keyword](/influxdb/v1.8/query_language/spec/#keywords).
To successfully query an identifier that's also a keyword, enclose that
identifier in double quotes.
**Examples**
*Query 1:*
```sql
> SELECT duration FROM runs
ERR: error parsing query: found DURATION, expected identifier, string, number, bool at line 1, char 8
```
In Query 1, the field key `duration` is an InfluxQL Keyword.
Double quote `duration` to avoid the error:
```sql
> SELECT "duration" FROM runs
```
*Query 2:*
```sql
> CREATE RETENTION POLICY limit ON telegraf DURATION 1d REPLICATION 1
ERR: error parsing query: found LIMIT, expected identifier at line 1, char 25
```
In Query 2, the retention policy name `limit` is an InfluxQL Keyword.
Double quote `limit` to avoid the error:
```sql
> CREATE RETENTION POLICY "limit" ON telegraf DURATION 1d REPLICATION 1
```
While using double quotes is an acceptable workaround, we recommend that you avoid using InfluxQL keywords as identifiers for simplicity's sake.
**Resources:**
[InfluxQL Keywords](/influxdb/v1.8/query_language/spec/#keywords),
[Query Language Documentation](/influxdb/v1.8/query_language/)
## `error parsing query: found < >, expected string at line < >, char < >`
The `expected string` error occurs when InfluxDB anticipates a string
but doesn't find it.
In most cases, the error is a result of forgetting to quote the password
string in the `CREATE USER` statement.
**Example**
```sql
> CREATE USER penelope WITH PASSWORD timeseries4dayz
ERR: error parsing query: found timeseries4dayz, expected string at line 1, char 36
```
The `CREATE USER` statement requires single quotation marks around the password
string:
```sql
> CREATE USER penelope WITH PASSWORD 'timeseries4dayz'
```
Note that you should not include the single quotes when authenticating requests.
**Resources:**
[Authentication and Authorization](/influxdb/v1.8/administration/authentication_and_authorization/)
## `error parsing query: mixing aggregate and non-aggregate queries is not supported`
The `mixing aggregate and non-aggregate` error occurs when a `SELECT` statement
includes both an [aggregate function](/influxdb/v1.8/query_language/functions/)
and a standalone [field key](/influxdb/v1.8/concepts/glossary/#field-key) or
[tag key](/influxdb/v1.8/concepts/glossary/#tag-key).
Aggregate functions return a single calculated value and there is no obvious
single value to return for any unaggregated fields or tags.
**Example**
*Raw data:*
The `peg` measurement has two fields (`square` and `round`) and one tag
(`force`):
```sql
name: peg
---------
time square round force
2016-10-07T18:50:00Z 2 8 1
2016-10-07T18:50:10Z 4 12 2
2016-10-07T18:50:20Z 6 14 4
2016-10-07T18:50:30Z 7 15 3
```
*Query 1:*
```sql
> SELECT mean("square"),"round" FROM "peg"
ERR: error parsing query: mixing aggregate and non-aggregate queries is not supported
```
Query 1 includes an aggregate function and a standalone field.
`mean("square")` returns a single aggregated value calculated from the four values
of `square` in the `peg` measurement, and there is no obvious single field value
to return from the four unaggregated values of the `round` field.
*Query 2:*
```sql
> SELECT mean("square"),"force" FROM "peg"
ERR: error parsing query: mixing aggregate and non-aggregate queries is not supported
```
Query 2 includes an aggregate function and a standalone tag.
`mean("square")` returns a single aggregated value calculated from the four values
of `square` in the `peg` measurement, and there is no obvious single tag value
to return from the four unaggregated values of the `force` tag.
**Resources:**
[Functions](/influxdb/v1.8/query_language/functions/)
## `invalid operation: time and *influxql.VarRef are not compatible`
The `time and *influxql.VarRef are not compatible` error occurs when
date-time strings are double quoted in queries.
Date-time strings require single quotes.
### Examples
Double quoted date-time strings:
```sql
> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= "2015-08-18T00:00:00Z" AND time <= "2015-08-18T00:12:00Z"
ERR: invalid operation: time and *influxql.VarRef are not compatible
```
Single quoted date-time strings:
```sql
> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:12:00Z'
name: h2o_feet
time water_level
---- -----------
2015-08-18T00:00:00Z 2.064
2015-08-18T00:06:00Z 2.116
2015-08-18T00:12:00Z 2.028
```
**Resources:**
[Data Exploration](/influxdb/v1.8/query_language/data_exploration/#time-syntax)
## `unable to parse < >: bad timestamp`
### Timestamp syntax
The `bad timestamp` error occurs when the
[line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol) includes a
timestamp in a format other than a UNIX timestamp.
**Example**
```sql
> INSERT pineapple value=1 '2015-08-18T23:00:00Z'
ERR: {"error":"unable to parse 'pineapple value=1 '2015-08-18T23:00:00Z'': bad timestamp"}
```
The line protocol above uses an [RFC3339](https://www.ietf.org/rfc/rfc3339.txt)
timestamp.
Replace the timestamp with a UNIX timestamp to avoid the error and successfully
write the point to InfluxDB:
```sql
> INSERT pineapple,fresh=true value=1 1439938800000000000
```
### InfluxDB line protocol syntax
In some cases, the `bad timestamp` error occurs with more general syntax errors
in the InfluxDB line protocol.
Line protocol is whitespace sensitive; misplaced spaces can cause InfluxDB
to assume that a field or tag is an invalid timestamp.
**Example**
*Write 1*
```sql
> INSERT hens location=2 value=9
ERR: {"error":"unable to parse 'hens location=2 value=9': bad timestamp"}
```
The line protocol in Write 1 separates the `hens` measurement from the `location=2`
tag with a space instead of a comma.
InfluxDB assumes that the `value=9` field is the timestamp and returns an error.
Use a comma instead of a space between the measurement and tag to avoid the error:
```sql
> INSERT hens,location=2 value=9
```
*Write 2*
```sql
> INSERT cows,name=daisy milk_prod=3 happy=3
ERR: {"error":"unable to parse 'cows,name=daisy milk_prod=3 happy=3': bad timestamp"}
```
The line protocol in Write 2 separates the `milk_prod=3` field and the
`happy=3` field with a space instead of a comma.
InfluxDB assumes that the `happy=3` field is the timestamp and returns an error.
Use a comma instead of a space between the two fields to avoid the error:
```sql
> INSERT cows,name=daisy milk_prod=3,happy=3
```
**Resources:**
[InfluxDB line protocol tutorial](/influxdb/v1.8/write_protocols/line_protocol_tutorial/),
[InfluxDB line protocol reference](/influxdb/v1.8/write_protocols/line_protocol_reference/)
## `unable to parse < >: time outside range`
The `time outside range` error occurs when the timestamp in the
[InfluxDB line protocol](/influxdb/v1.8/concepts/glossary/#influxdb-line-protocol)
falls outside the valid time range for InfluxDB.
The minimum valid timestamp is `-9223372036854775806` or `1677-09-21T00:12:43.145224194Z`.
The maximum valid timestamp is `9223372036854775806` or `2262-04-11T23:47:16.854775806Z`.
**Resources:**
[InfluxDB line protocol tutorial](/influxdb/v1.8/write_protocols/line_protocol_tutorial/#data-types),
[InfluxDB line protocol reference](/influxdb/v1.8/write_protocols/line_protocol_reference/#data-types)
## `write failed for shard < >: engine: cache maximum memory size exceeded`
The `cache maximum memory size exceeded` error occurs when the cached
memory size increases beyond the
[`cache-max-memory-size` setting](/influxdb/v1.8/administration/config/#cache-max-memory-size-1g)
in the configuration file.
By default, `cache-max-memory-size` is set to `1g`.
This value is fine for most workloads, but is too small for larger write volumes
or for datasets with higher [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality).
If you have plenty of RAM, you can set it to `0` to disable the cache memory limit and never get this error.
You can also examine the `memBytes` field in the `cache` measurement in the
[`_internal` database](/influxdb/v1.8/administration/server_monitoring/#internal-monitoring)
to get a sense of how much memory the caches are using.
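For example, a quick way to inspect current cache sizes from the CLI, assuming the `_internal` database is enabled with its default `monitor` retention policy:

```bash
# Show the most recent cache size, in bytes, for each database.
influx -execute 'SELECT last("memBytes") FROM "_internal"."monitor"."cache" GROUP BY "database"'
```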
**Resources:**
[Database Configuration](/influxdb/v1.8/administration/config/)
## `already killed`
The `already killed` error occurs when a query has already been killed, but
there are subsequent kill attempts before the query has exited.
When a query is killed, it may not exit immediately.
It will be in the `killed` state, which means the signal has been sent, but the
query itself has not hit an interrupt point.
**Resources:**
[Query management](/influxdb/v1.8/troubleshooting/query_management/)
## Common `-import` errors
Find common errors that occur when importing data in the command line interface (CLI).
1. (Optional) Customize how to view `-import` errors and output by running any of the following commands:
   - Send errors and output to a new file: `influx -import -path={import-file}.gz -compressed > {new-file} 2>&1`
- Send errors and output to separate files: `influx -import -path={import-file}.gz -compressed > {output-file} 2> {error-file}`
- Send errors to a new file: `influx -import -path={import-file}.gz -compressed 2> {new-file}`
   - Send output to a new file: `influx -import -path={import-file}.gz -compressed > {new-file}`
2. Review import errors for possible causes to resolve:
- [Inconsistent data types](#inconsistent-data-types)
- [Data points older than retention policy](#data-points-older-than-retention-policy)
- [Unnamed import file](#unnamed-import-file)
- [Docker container cannot read host files](#docker-container-cannot-read-host-files)
> **Note:** To learn how to use the `-import` command, see [Import data from a file with `-import`](/influxdb/v1.8/tools/shell/#import-data-from-a-file-with-import).
### Inconsistent data types
**Error:** `partial write: field type conflict:`
This error occurs when fields in an imported measurement have inconsistent data types. Make sure all fields in a measurement have the same data type, such as float64, int64, and so on.
### Data points older than retention policy
**Error:** `partial write: points beyond retention policy dropped={number-of-points-dropped}`
This error occurs when an imported data point is older than the specified retention policy and dropped. Verify the correct retention policy is specified in the import file.
### Unnamed import file
**Error:** `reading standard input: /path/to/directory: is a directory`
This error occurs when the `-import` command doesn't include the name of an import file. Specify the file to import, for example: `$ influx -import -path={filename}.txt -precision=s`
### Docker container cannot read host files
**Error:** `open /path/to/file: no such file or directory`
This error occurs when the Docker container cannot read files on the host machine. To make host machine files readable, complete the following procedure.
#### Make host machine files readable to Docker
1. Create a directory, and then copy files to import into InfluxDB to this directory.
2. When you launch the Docker container, mount the new directory on the InfluxDB container by running the following command:

       docker run -v /dir/path/on/host:/dir/path/in/container

3. Verify the Docker container can read host machine files by running the following command:

       influx -import -path=/path/in/container
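Putting the procedure together, a hypothetical end-to-end example might look like this (image tag and paths are illustrative):

```bash
# Start InfluxDB with a host directory mounted into the container.
docker run -d --name influxdb -v /home/me/imports:/imports influxdb:1.8

# Import a file from the mounted directory inside the container.
docker exec -it influxdb influx -import -path=/imports/datarrr.txt -precision=s
```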
File diff suppressed because it is too large
View File
@ -0,0 +1,180 @@
---
title: InfluxQL query management
menu:
influxdb_1_8:
name: Query management
weight: 20
parent: Troubleshooting
---
Manage your InfluxQL queries using the following:
- [SHOW QUERIES](#list-currently-running-queries-with-show-queries) to identify currently-running queries
- [KILL QUERIES](#stop-currently-running-queries-with-kill-query) to stop queries overloading your system
- [Configuration settings](#configuration-settings-for-query-management) to prevent and halt the execution of inefficient queries
> The commands and configurations provided on this page are for **Influx Query Language (InfluxQL) only** -- **no equivalent set of Flux commands and configurations currently exists**. For the most current Flux documentation, see [Get started with Flux](/flux/v0.50/introduction/getting-started/).
## List currently-running queries with `SHOW QUERIES`
`SHOW QUERIES` lists the query id, query text, relevant database, and duration
of all currently-running queries on your InfluxDB instance.
#### Syntax
```sql
SHOW QUERIES
```
#### Example
```
> SHOW QUERIES
qid query database duration status
--- ----- -------- -------- ------
37 SHOW QUERIES 100368u running
36 SELECT mean(myfield) FROM mymeas mydb 3s running
```
##### Explanation of the output
- `qid`: The id number of the query. Use this value with [`KILL QUERY`](/influxdb/v1.8/troubleshooting/query_management/#stop-currently-running-queries-with-kill-query).
- `query`: The query text.
- `database`: The database targeted by the query.
- `duration`: The length of time that the query has been running.
See [Query Language Reference](/influxdb/v1.8/query_language/spec/#durations)
for an explanation of time units in InfluxDB databases.
{{% note %}}
`SHOW QUERIES` may output a killed query and continue to increment its duration
until the query record is cleared from memory.
{{% /note %}}
- `status`: The current status of the query.
## Stop currently-running queries with `KILL QUERY`
`KILL QUERY` tells InfluxDB to stop running the relevant query.
#### Syntax
Where `qid` is the query ID, displayed in the [`SHOW QUERIES`](/influxdb/v1.8/troubleshooting/query_management/#list-currently-running-queries-with-show-queries) output:
```sql
KILL QUERY <qid>
```
***InfluxDB Enterprise clusters:*** To kill queries on a cluster, you need to specify the query ID (qid) and the TCP host (for example, `myhost:8088`),
available in the `SHOW QUERIES` output.
```sql
KILL QUERY <qid> ON "<host>"
```
A successful `KILL QUERY` query returns no results.
#### Examples
```sql
-- kill query with qid of 36 on the local host
> KILL QUERY 36
>
```
```sql
-- kill query on InfluxDB Enterprise cluster
> KILL QUERY 53 ON "myhost:8088"
>
```
## Configuration settings for query management
The following configuration settings are in the
[coordinator](/influxdb/v1.8/administration/config/#query-management-settings) section of the
configuration file.
### `max-concurrent-queries`
The maximum number of running queries allowed on your instance.
The default setting (`0`) allows for an unlimited number of queries.
If you exceed `max-concurrent-queries`, InfluxDB does not execute the query and
outputs the following error:
```
ERR: max concurrent queries reached
```
### `query-timeout`
The maximum time for which a query can run on your instance before InfluxDB
kills the query.
The default setting (`"0"`) allows queries to run with no time restrictions.
This setting is a [duration literal](/influxdb/v1.8/query_language/spec/#durations).
If your query exceeds the query timeout, InfluxDB kills the query and outputs
the following error:
```
ERR: query timeout reached
```
### `log-queries-after`
The maximum time a query can run after which InfluxDB logs the query with a
`Detected slow query` message.
The default setting (`"0"`) will never tell InfluxDB to log the query.
This setting is a [duration literal](/influxdb/v1.8/query_language/spec/#durations).
Example log output with `log-queries-after` set to `"1s"`:
```
[query] 2016/04/28 14:11:31 Detected slow query: SELECT mean(usage_idle) FROM cpu WHERE time >= 0 GROUP BY time(20s) (qid: 3, database: telegraf, threshold: 1s)
```
`qid` is the id number of the query.
Use this value with [`KILL QUERY`](/influxdb/v1.8/troubleshooting/query_management/#stop-currently-running-queries-with-kill-query).
The default location for the log output file is `/var/log/influxdb/influxdb.log`. However, on systems that use systemd (most modern Linux distributions), logs are written to the journal. View the InfluxDB logs using the following command: `journalctl -u influxdb`.
### `max-select-point`
The maximum number of [points](/influxdb/v1.8/concepts/glossary/#point) that a
`SELECT` statement can process.
The default setting (`0`) allows the `SELECT` statement to process an unlimited
number of points.
If your query exceeds `max-select-point`, InfluxDB kills the query and outputs
the following error:
```
ERR: max number of points reached
```
### `max-select-series`
The maximum number of [series](/influxdb/v1.8/concepts/glossary/#series) that a
`SELECT` statement can process.
The default setting (`0`) allows the `SELECT` statement to process an unlimited
number of series.
If your query exceeds `max-select-series`, InfluxDB does not execute the query
and outputs the following error:
```
ERR: max select series count exceeded: <query_series_count> series
```
### `max-select-buckets`
The maximum number of `GROUP BY time()` buckets that a query can process.
The default setting (`0`) allows a query to process an unlimited number of
buckets.
If your query exceeds `max-select-buckets`, InfluxDB does not execute the query
and outputs the following error:
```
ERR: max select bucket count exceeded: <query_bucket_count> buckets
```
View File
@ -0,0 +1,18 @@
---
title: Write protocols in InfluxDB
description: Covers the InfluxDB line protocol and a tutorial using line protocol to write data to InfluxDB.
menu:
influxdb_1_8:
name: Write protocols
weight: 80
---
The InfluxDB line protocol is a text-based format for writing points to InfluxDB databases.
## [InfluxDB line protocol tutorial](/influxdb/v1.8/write_protocols/line_protocol_tutorial/)
The [InfluxDB line protocol tutorial](/influxdb/v1.8/write_protocols/line_protocol_tutorial/) uses temperature data to introduce you to the InfluxDB line protocol and writing data to InfluxDB.
## [InfluxDB line protocol reference](/influxdb/v1.8/write_protocols/line_protocol_reference/)
The [InfluxDB line protocol reference](/influxdb/v1.8/write_protocols/line_protocol_reference/) covers the InfluxDB line protocol syntax, data types, and guidelines.
View File
@ -0,0 +1,287 @@
---
title: InfluxDB line protocol reference
aliases:
- /influxdb/v1.8/write_protocols/write_syntax/
menu:
influxdb_1_8:
name: InfluxDB line protocol reference
weight: 10
parent: Write protocols
---
InfluxDB line protocol is a text-based format for writing points to InfluxDB.
## Line protocol syntax
```
<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]
```
Line protocol accepts the newline character `\n` and is whitespace-sensitive.
> **Note:** Line protocol does not support the newline character `\n` in tag values or field values.
### Syntax description
InfluxDB line protocol informs InfluxDB of the data's measurement, tag set, field set, and timestamp.
| Element | Optional/Required | Description | Type<br>(See [data types](#data-types) for more information.) |
| :-------| :---------------- |:----------- |:----------------
| [Measurement](/influxdb/v1.8/concepts/glossary/#measurement) | Required | The measurement name. InfluxDB accepts one measurement per point. | String
| [Tag set](/influxdb/v1.8/concepts/glossary/#tag-set) | Optional | All tag key-value pairs for the point. | [Tag keys](/influxdb/v1.8/concepts/glossary/#tag-key) and [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) are both strings.
| [Field set](/influxdb/v1.8/concepts/glossary/#field-set) | Required. Points must have at least one field. | All field key-value pairs for the point. | [Field keys](/influxdb/v1.8/concepts/glossary/#field-key) are strings. [Field values](/influxdb/v1.8/concepts/glossary/#field-value) can be floats, integers, strings, or Booleans.
| [Timestamp](/influxdb/v1.8/concepts/glossary/#timestamp) | Optional. InfluxDB uses the server's local nanosecond timestamp in UTC if the timestamp is not included with the point. | The timestamp for the data point. InfluxDB accepts one timestamp per point. | Unix nanosecond timestamp. Specify alternative precisions with the [InfluxDB API](/influxdb/v1.8/tools/api/#write-http-endpoint).
> #### Performance tips:
>
- Before sending data to InfluxDB, sort by tag key to match the results from the
[Go bytes.Compare function](http://golang.org/pkg/bytes/#Compare).
- To significantly improve compression, use the coarsest [precision](/influxdb/v1.8/tools/api/#write-http-endpoint) possible for timestamps.
- Use the Network Time Protocol (NTP) to synchronize time between hosts. InfluxDB uses a host's local time in UTC to assign timestamps to data. If a host's clock isn't synchronized with NTP, the data that the host writes to InfluxDB may have inaccurate timestamps.
## Data types
| Datatype | Element(s) | Description |
| :----------- | :------------------------ |:------------ |
| Float | Field values | Default numerical type. IEEE-754 64-bit floating-point numbers (except NaN or +/- Inf). Examples: `1`, `1.0`, `1.e+78`, `1.E+78`. |
| Integer | Field values | Signed 64-bit integers (-9223372036854775808 to 9223372036854775807). Specify an integer with a trailing `i` on the number. Example: `1i`. |
| String | Measurements, tag keys, tag values, field keys, field values | Length limit 64KB. |
| Boolean | Field values | Stores TRUE or FALSE values.<br><br>TRUE write syntax: `[t, T, true, True, TRUE]`.<br><br>FALSE write syntax: `[f, F, false, False, FALSE]` |
| Timestamp | Timestamps | Unix nanosecond timestamp. Specify alternative precisions with the [InfluxDB API](/influxdb/v1.8/tools/api/#write-http-endpoint). The minimum valid timestamp is `-9223372036854775806` or `1677-09-21T00:12:43.145224194Z`. The maximum valid timestamp is `9223372036854775806` or `2262-04-11T23:47:16.854775806Z`. |
#### Boolean syntax for writes and queries
Acceptable Boolean syntax differs for data writes and data queries.
For more information, see
[Frequently asked questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#why-can-t-i-query-boolean-field-values).
#### Field type discrepancies
In a measurement, a field's type cannot differ in a [shard](/influxdb/v1.8/concepts/glossary/#shard), but can differ across
shards.
To learn how field value type discrepancies can affect `SELECT *` queries, see
[How does InfluxDB handle field type discrepancies across shards?](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-field-type-discrepancies-across-shards).
### Examples
#### Write the field value `-1.234456e+78` as a float to InfluxDB
```sql
> INSERT mymeas value=-1.234456e+78
```
InfluxDB supports field values specified in scientific notation.
#### Write a field value `1.0` as a float to InfluxDB
```sql
> INSERT mymeas value=1.0
```
#### Write the field value `1` as a float to InfluxDB
```sql
> INSERT mymeas value=1
```
#### Write the field value `1` as an integer to InfluxDB
```sql
> INSERT mymeas value=1i
```
#### Write the field value `stringing along` as a string to InfluxDB
```sql
> INSERT mymeas value="stringing along"
```
Always double quote string field values. More on quoting [below](#quoting).
#### Write the field value `true` as a Boolean to InfluxDB
```sql
> INSERT mymeas value=true
```
Do not quote Boolean field values.
The following statement writes `true` as a string field value to InfluxDB:
```sql
> INSERT mymeas value="true"
```
#### Attempt to write a string to a field that previously accepted floats
If the timestamps on the float and string are stored in the same shard:
```sql
> INSERT mymeas value=3 1465934559000000000
> INSERT mymeas value="stringing along" 1465934559000000001
ERR: {"error":"field type conflict: input field \"value\" on measurement \"mymeas\" is type string, already exists as type float"}
```
If the timestamps on the float and string are not stored in the same shard:
```sql
> INSERT mymeas value=3 1465934559000000000
> INSERT mymeas value="stringing along" 1466625759000000000
>
```
## Quoting, special characters, and additional naming guidelines
### Quoting
| Element | Double quotes | Single quotes |
| :------ | :------------ |:------------- |
| Timestamp | Never | Never |
| Measurements, tag keys, tag values, field keys | Never* | Never* |
| Field values | Double quote string field values. Do not double quote floats, integers, or Booleans. | Never |
\* InfluxDB line protocol allows users to double and single quote measurement names, tag
keys, tag values, and field keys.
It will, however, assume that the double or single quotes are part of the name,
key, or value.
This can complicate query syntax (see the example below).
#### Examples
##### Invalid line protocol - Double quote the timestamp
```sql
> INSERT mymeas value=9 "1466625759000000000"
ERR: {"error":"unable to parse 'mymeas value=9 \"1466625759000000000\"': bad timestamp"}
```
Double quoting (or single quoting) the timestamp yields a `bad timestamp`
error.
##### Semantic error - Double quote a Boolean field value
```sql
> INSERT mymeas value="true"
> SHOW FIELD KEYS FROM "mymeas"
name: mymeas
------------
fieldKey fieldType
value string
```
InfluxDB assumes that all double quoted field values are strings.
##### Semantic error - Double quote a measurement name
```sql
> INSERT "mymeas" value=200
> SHOW MEASUREMENTS
name: measurements
------------------
name
"mymeas"
> SELECT * FROM mymeas
> SELECT * FROM "mymeas"
> SELECT * FROM "\"mymeas\""
name: "mymeas"
--------------
time value
2016-06-14T20:36:21.836131014Z 200
```
If you double quote a measurement in line protocol, any queries on that
measurement require both double quotes and escaped (`\`) double quotes in the
`FROM` clause.
### Special characters
You must use a backslash character `\` to escape the following special characters:
* In string field values, you must escape:
* double quotes
* backslash character
For example, `\"` escapes double quote.
>#### Note on backslashes:
>
* If you use multiple backslashes, they must be escaped. Influx interprets backslashes as follows:
* `\` or `\\` interpreted as `\`
* `\\\` or `\\\\` interpreted as `\\`
* `\\\\\` or `\\\\\\` interpreted as `\\\`, and so on
* In tag keys, tag values, and field keys, you must escape:
* commas
* equal signs
* spaces
For example, `\,` escapes a comma.
* In measurements, you must escape:
* commas
* spaces
You do not need to escape other special characters.
#### Examples
##### Write a point with special characters
```sql
> INSERT "measurement\ with\ quo⚡es\ and\ emoji",tag\ key\ with\ sp🚀ces=tag\,value\,with"commas" field_k\ey="string field value, only \" need be esc🍭ped"
```
The system writes a point where the measurement is `"measurement with quo⚡es and emoji"`, the tag key is `tag key with sp🚀ces`, the
tag value is `tag,value,with"commas"`, the field key is `field_k\ey` and the field value is `string field value, only " need be esc🍭ped`.
### Additional naming guidelines
`#` at the beginning of the line is a valid comment character for line protocol.
InfluxDB will ignore all subsequent characters until the next newline `\n`.
Measurement names, tag keys, tag values, field keys, and field values are
case sensitive.
InfluxDB line protocol accepts
[InfluxQL keywords](/influxdb/v1.8/query_language/spec/#keywords)
as [identifier](/influxdb/v1.8/concepts/glossary/#identifier) names.
In general, we recommend avoiding using InfluxQL keywords in your schema as
it can cause
[confusion](/influxdb/v1.8/troubleshooting/errors/#error-parsing-query-found-expected-identifier-at-line-char) when querying the data.
> **Note:** Avoid using the reserved keys `_field` and `_measurement`. If these keys are included as a tag or field key, the associated point is discarded.
The keyword `time` is a special case.
`time` can be a
[continuous query](/influxdb/v1.8/concepts/glossary/#continuous-query-cq) name,
database name,
[measurement](/influxdb/v1.8/concepts/glossary/#measurement) name,
[retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) name,
[subscription](/influxdb/v1.8/concepts/glossary/#subscription) name, and
[user](/influxdb/v1.8/concepts/glossary/#user) name.
In those cases, `time` does not require double quotes in queries.
`time` cannot be a [field key](/influxdb/v1.8/concepts/glossary/#field-key) or
[tag key](/influxdb/v1.8/concepts/glossary/#tag-key);
InfluxDB rejects writes with `time` as a field key or tag key and returns an error.
See [Frequently Asked Questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#time) for more information.
## InfluxDB line protocol in practice
To learn how to write line protocol to the database, see [Tools](/influxdb/v1.8/tools/).
### Duplicate points
A point is uniquely identified by the measurement name, tag set, and timestamp.
If you write a point to a series with a timestamp that matches an existing point, the field set becomes a union of the old and new field set, and conflicts favor the new field set.
For a complete example of this behavior and how to avoid it, see
[How does InfluxDB handle duplicate points?](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)
### Duplicate keys
If a measurement has a tag key and a field key with the same name, one of the keys is returned with `_1` appended in query results (and as a column header in Chronograf). For example, `location` and `location_1`. To query a duplicate key, drop the `_1` and use the InfluxQL `::tag` or `::field` syntax in your query, for example:
```sql
SELECT "location"::tag, "location"::field FROM "database_name"."retention_policy"."measurement"
```
View File
@ -0,0 +1,503 @@
---
title: InfluxDB line protocol tutorial
aliases:
- /influxdb/v1.8/write_protocols/line/
menu:
influxdb_1_8:
weight: 20
parent: Write protocols
---
The InfluxDB line protocol is a text-based format for writing points to the
database.
Points must be in line protocol format for InfluxDB to successfully parse and
write points (unless you're using a [service plugin](/influxdb/v1.8/supported_protocols/)).
Using fictional temperature data, this page introduces InfluxDB line protocol.
It covers:
<table style="width:100%">
<tr>
<td><a href="#syntax">Syntax</a></td>
<td><a href="#data-types">Data Types</a></td>
<td><a href="#quoting">Quoting</a></td>
<td><a href="#special-characters-and-keywords">Special Characters and Keywords</a></td>
</tr>
</table>
The final section, [Writing data to InfluxDB](#writing-data-to-influxdb),
describes how to get data into InfluxDB and how InfluxDB handles Line
Protocol duplicates.
## Syntax
A single line of text in line protocol format represents one data point in InfluxDB.
It informs InfluxDB of the point's measurement, tag set, field set, and
timestamp.
The following code block shows a sample of line protocol and breaks it into its
individual components:
```
weather,location=us-midwest temperature=82 1465839830100400200
  |    -------------------- --------------  |
  |             |             |             |
  |             |             |             |
+-----------+--------+-+---------+-+---------+
|measurement|,tag_set| |field_set| |timestamp|
+-----------+--------+-+---------+-+---------+
```
Moving across each element in the diagram:
### Measurement
The name of the [measurement](/influxdb/v1.8/concepts/glossary/#measurement)
that you want to write your data to.
The measurement is required in line protocol.
In the example, the measurement name is `weather`.
### Tag set
The [tag(s)](/influxdb/v1.8/concepts/glossary/#tag) that you want to include
with your data point.
Tags are optional in line protocol.
> **Note:** Avoid using the reserved keys `_field`, `_measurement`, and `time`. If reserved keys are included as a tag or field key, the associated point is discarded.
Notice that the measurement and tag set are separated by a comma and no spaces.
Separate tag key-value pairs with an equals sign `=` and no spaces:
```
<tag_key>=<tag_value>
```
Separate multiple tag-value pairs with a comma and no spaces:
```
<tag_key>=<tag_value>,<tag_key>=<tag_value>
```
In the example, the tag set consists of one tag: `location=us-midwest`.
Adding another tag (`season=summer`) to the example looks like this:
```
weather,location=us-midwest,season=summer temperature=82 1465839830100400200
```
For best performance, sort tags by key before sending them to the
database.
The sort should match the results from the
[Go bytes.Compare function](http://golang.org/pkg/bytes/#Compare).
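For illustration, here's a minimal Python sketch of one way to apply that ordering when serializing a point; the `serialize_point` helper is an assumption for this example, not part of any client library:

```python
# Minimal sketch: serialize a point with tags sorted bytewise,
# matching the ordering of Go's bytes.Compare.
def serialize_point(measurement, tags, fields, timestamp):
    # Sorting the UTF-8 encoded keys gives the same order as bytes.Compare.
    tag_str = ",".join(
        f"{key}={tags[key]}"
        for key in sorted(tags, key=lambda k: k.encode("utf-8"))
    )
    field_str = ",".join(f"{key}={value}" for key, value in fields.items())
    return f"{measurement},{tag_str} {field_str} {timestamp}"

print(serialize_point(
    "weather",
    {"season": "summer", "location": "us-midwest"},  # unsorted input
    {"temperature": 82},
    1465839830100400200,
))
# weather,location=us-midwest,season=summer temperature=82 1465839830100400200
```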
### Whitespace I
Separate the measurement and the field set or, if you're including a tag set
with your data point, separate the tag set and the field set with a whitespace.
The whitespace is required in line protocol.
Valid line protocol with no tag set:
```
weather temperature=82 1465839830100400200
```
### Field set
The [field(s)](/influxdb/v1.8/concepts/glossary/#field) for your data point.
Every data point requires at least one field in line protocol.
Separate field key-value pairs with an equals sign `=` and no spaces:
```
<field_key>=<field_value>
```
Separate multiple field-value pairs with a comma and no spaces:
```
<field_key>=<field_value>,<field_key>=<field_value>
```
In the example, the field set consists of one field: `temperature=82`.
Adding another field (`humidity=71`) to the example looks like this:
```
weather,location=us-midwest temperature=82,humidity=71 1465839830100400200
```
### Whitespace II
Separate the field set and the optional timestamp with a whitespace.
The whitespace is required in line protocol if you're including a timestamp.
### Timestamp
The [timestamp](/influxdb/v1.8/concepts/glossary/#timestamp) for your data
point in nanosecond-precision Unix time.
The timestamp is optional in line protocol.
If you do not specify a timestamp for your data point, InfluxDB uses the
server's local nanosecond timestamp in UTC.
In the example, the timestamp is `1465839830100400200` (that's
`2016-06-13T17:43:50.1004002Z` in RFC3339 format).
The line protocol below is the same data point but without the timestamp.
When InfluxDB writes it to the database it uses your server's
local timestamp instead of `2016-06-13T17:43:50.1004002Z`.
```
weather,location=us-midwest temperature=82
```
Use the InfluxDB API to specify timestamps with a precision other than nanoseconds,
such as microseconds, milliseconds, or seconds.
We recommend using the coarsest precision possible as this can result in
significant improvements in compression.
See the [API Reference](/influxdb/v1.8/tools/api/#write-http-endpoint) for more information.
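As a rough sketch of what that looks like from a client, the following Python example posts a second-precision point using the `requests` library (a third-party HTTP client); it assumes InfluxDB 1.8 is listening on `localhost:8086` and reuses the `science_is_cool` database name from the curl example later on this page:

```python
# Minimal sketch: write a point whose timestamp is in seconds,
# telling InfluxDB the precision via the "precision" query parameter.
# Assumes InfluxDB 1.8 is listening on localhost:8086.
import requests

line = "weather,location=us-midwest temperature=82 1465839830"  # seconds, not nanoseconds

resp = requests.post(
    "http://localhost:8086/write",
    params={"db": "science_is_cool", "precision": "s"},
    data=line.encode("utf-8"),
)
resp.raise_for_status()  # the /write endpoint returns 204 No Content on success
```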
> #### Setup Tip:
>
Use the Network Time Protocol (NTP) to synchronize time between hosts.
InfluxDB uses a host's local time in UTC to assign timestamps to data; if
hosts' clocks aren't synchronized with NTP, the timestamps on the data written
to InfluxDB can be inaccurate.
## Data types
This section covers the data types of line protocol's major components:
[measurements](/influxdb/v1.8/concepts/glossary/#measurement),
[tag keys](/influxdb/v1.8/concepts/glossary/#tag-key),
[tag values](/influxdb/v1.8/concepts/glossary/#tag-value),
[field keys](/influxdb/v1.8/concepts/glossary/#field-key),
[field values](/influxdb/v1.8/concepts/glossary/#field-value), and
[timestamps](/influxdb/v1.8/concepts/glossary/#timestamp).
Measurements, tag keys, tag values, and field keys are always strings.
> **Note:**
Because InfluxDB stores tag values as strings, InfluxDB cannot perform math on
tag values.
In addition, InfluxQL [functions](/influxdb/v1.8/query_language/functions/)
do not accept a tag value as a primary argument.
Take that information into account when designing your
[schema](/influxdb/v1.8/concepts/glossary/#schema).
Timestamps are UNIX timestamps.
The minimum valid timestamp is `-9223372036854775806` or `1677-09-21T00:12:43.145224194Z`.
The maximum valid timestamp is `9223372036854775806` or `2262-04-11T23:47:16.854775806Z`.
As mentioned above, by default, InfluxDB assumes that timestamps have
nanosecond precision.
See the [API Reference](/influxdb/v1.8/tools/api/#write-http-endpoint) for how to specify
alternative precisions.
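For example, here is a small Python sketch that builds the nanosecond-precision timestamp used by the sample point; the sub-second component is added as an integer to avoid floating-point rounding:

```python
# Minimal sketch: compute a nanosecond-precision Unix timestamp.
from datetime import datetime, timezone

dt = datetime(2016, 6, 13, 17, 43, 50, tzinfo=timezone.utc)
ns = int(dt.timestamp()) * 1_000_000_000 + 100_400_200  # whole seconds + nanoseconds
print(ns)  # 1465839830100400200
```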
Field values can be floats, integers, strings, or Booleans:
* Floats - by default, InfluxDB assumes all numerical field values are floats.
Store the field value `82` as a float:
```
weather,location=us-midwest temperature=82 1465839830100400200
```
* Integers - append an `i` to the field value to tell InfluxDB to store the
number as an integer.
Store the field value `82` as an integer:
```
weather,location=us-midwest temperature=82i 1465839830100400200
```
* Strings - double quote string field values (more on quoting in line protocol
[below](#quoting)).
Store the field value `too warm` as a string:
```
weather,location=us-midwest temperature="too warm" 1465839830100400200
```
* Booleans - specify TRUE with `t`, `T`, `true`, `True`, or `TRUE`. Specify
FALSE with `f`, `F`, `false`, `False`, or `FALSE`.
Store the field value `true` as a Boolean:
```
weather,location=us-midwest too_hot=true 1465839830100400200
```
> **Note:** Acceptable Boolean syntax differs for data writes and data
queries. See
[Frequently Asked Questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#why-can-t-i-query-boolean-field-values)
for more information.
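To make these rules concrete, here is a hedged Python sketch that renders a field value with the suffix and quoting each type requires; `format_field_value` is a hypothetical helper, not an official API:

```python
# Minimal sketch: render a field value using line protocol type rules.
def format_field_value(value):
    if isinstance(value, bool):          # check bool before int (bool subclasses int)
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"               # trailing "i" marks an integer
    if isinstance(value, float):
        return repr(value)               # floats need no suffix
    if isinstance(value, str):
        escaped = value.replace('"', '\\"')
        return f'"{escaped}"'            # string values are double quoted
    raise TypeError(f"unsupported field type: {type(value)!r}")

print(format_field_value(82))         # 82i
print(format_field_value(82.0))       # 82.0
print(format_field_value(True))       # true
print(format_field_value("too warm")) # "too warm"
```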
Within a measurement, a field's type cannot differ within a
[shard](/influxdb/v1.8/concepts/glossary/#shard), but it can differ across
shards. For example, writing an integer to a field that previously accepted
floats fails if InfluxDB attempts to store the integer in the same shard as the
floats:
```sql
> INSERT weather,location=us-midwest temperature=82 1465839830100400200
> INSERT weather,location=us-midwest temperature=81i 1465839830100400300
ERR: {"error":"field type conflict: input field \"temperature\" on measurement \"weather\" is type int64, already exists as type float"}
```
But, writing an integer to a field that previously accepted floats succeeds if
InfluxDB stores the integer in a new shard:
```sql
> INSERT weather,location=us-midwest temperature=82 1465839830100400200
> INSERT weather,location=us-midwest temperature=81i 1467154750000000000
>
```
See
[Frequently Asked Questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-field-type-discrepancies-across-shards)
for how field value type discrepancies can affect `SELECT *` queries.
## Quoting
This section covers when not to and when to double (`"`) or single (`'`)
quote in line protocol.
Moving from never quote to please do quote:
* Never double or single quote the timestamp.
It's not valid line protocol.
Example:
```
> INSERT weather,location=us-midwest temperature=82 "1465839830100400200"
ERR: {"error":"unable to parse 'weather,location=us-midwest temperature=82 \"1465839830100400200\"': bad timestamp"}
```
* Never single quote field values (even if they're strings!).
It's also not valid line protocol.
Example:
```
> INSERT weather,location=us-midwest temperature='too warm'
ERR: {"error":"unable to parse 'weather,location=us-midwest temperature='too warm'': invalid boolean"}
```
* Do not double or single quote measurement names, tag keys, tag values, and field
keys.
It is valid line protocol but InfluxDB assumes that the quotes are part of the
name.
Example:
```
> INSERT weather,location=us-midwest temperature=82 1465839830100400200
> INSERT "weather",location=us-midwest temperature=87 1465839830100400200
> SHOW MEASUREMENTS
name: measurements
------------------
name
"weather"
weather
```
To query data in `"weather"` you need to double quote the measurement name and
escape the measurement's double quotes:
```
> SELECT * FROM "\"weather\""
name: "weather"
---------------
time location temperature
2016-06-13T17:43:50.1004002Z us-midwest 87
```
* Do not double quote field values that are floats, integers, or Booleans.
InfluxDB will assume that those values are strings.
Example:
```
> INSERT weather,location=us-midwest temperature="82"
> SELECT * FROM weather WHERE temperature >= 70
>
```
* Do double quote field values that are strings.
Example:
```
> INSERT weather,location=us-midwest temperature="too warm"
> SELECT * FROM weather
name: weather
-------------
time location temperature
2016-06-13T19:10:09.995766248Z us-midwest too warm
```
## Special characters and keywords
### Special characters
For tag keys, tag values, and field keys always use a backslash character `\`
to escape:
* commas `,`
```
weather,location=us\,midwest temperature=82 1465839830100400200
```
* equal signs `=`
```
weather,location=us-midwest temp\=rature=82 1465839830100400200
```
* spaces
```
weather,location\ place=us-midwest temperature=82 1465839830100400200
```
For measurements always use a backslash character `\` to escape:
* commas `,`
```
wea\,ther,location=us-midwest temperature=82 1465839830100400200
```
* spaces
```
wea\ ther,location=us-midwest temperature=82 1465839830100400200
```
For string field values use a backslash character `\` to escape:
* double quotes `"`
```
weather,location=us-midwest temperature="too\"hot\"" 1465839830100400200
```
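Pulling those rules together, this minimal Python sketch escapes each component before assembling a line; the helper names are assumptions for illustration:

```python
# Minimal sketch: escape line protocol components per the rules above.
def escape_key(s):
    # Tag keys, tag values, and field keys escape commas, equals signs, and spaces.
    return s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")

def escape_measurement(s):
    # Measurement names escape commas and spaces (but not equals signs).
    return s.replace(",", "\\,").replace(" ", "\\ ")

def escape_string_field_value(s):
    # String field values escape double quotes.
    return s.replace('"', '\\"')

measurement = escape_measurement("wea ther")
tag = escape_key("location place") + "=us-midwest"
value = escape_string_field_value('too "hot"')
print(f'{measurement},{tag} temperature_str="{value}" 1465839830100400200')
# wea\ ther,location\ place=us-midwest temperature_str="too \"hot\"" 1465839830100400200
```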
Line protocol does not require users to escape the backslash character `\` but
will not complain if you do. For example, inserting the following:
```
weather,location=us-midwest temperature_str="too hot/cold" 1465839830100400201
weather,location=us-midwest temperature_str="too hot\cold" 1465839830100400202
weather,location=us-midwest temperature_str="too hot\\cold" 1465839830100400203
weather,location=us-midwest temperature_str="too hot\\\cold" 1465839830100400204
weather,location=us-midwest temperature_str="too hot\\\\cold" 1465839830100400205
weather,location=us-midwest temperature_str="too hot\\\\\cold" 1465839830100400206
```
InfluxDB interprets these writes as follows (notice that a single and a double backslash produce the same record):
```sql
> SELECT * FROM "weather"
name: weather
time location temperature_str
---- -------- ---------------
1465839830100400201 us-midwest too hot/cold
1465839830100400202 us-midwest too hot\cold
1465839830100400203 us-midwest too hot\cold
1465839830100400204 us-midwest too hot\\cold
1465839830100400205 us-midwest too hot\\cold
1465839830100400206 us-midwest too hot\\\cold
```
All other special characters also do not require escaping.
For example, line protocol handles emojis with no problem:
```sql
> INSERT we⛅ther,location=us-midwest temper🔥ture=82 1465839830100400200
> SELECT * FROM "we⛅ther"
name: we⛅ther
------------------
time location temper🔥ture
1465839830100400200 us-midwest 82
```
### Keywords
Line protocol accepts
[InfluxQL keywords](/influxdb/v1.8/query_language/spec/#keywords)
as [identifier](/influxdb/v1.8/concepts/glossary/#identifier) names.
In general, we recommend avoiding InfluxQL keywords in your schema, as they
can cause
[confusion](/influxdb/v1.8/troubleshooting/errors/#error-parsing-query-found-expected-identifier-at-line-char) when querying the data.
The keyword `time` is a special case.
`time` can be a
[continuous query](/influxdb/v1.8/concepts/glossary/#continuous-query-cq) name,
database name,
[measurement](/influxdb/v1.8/concepts/glossary/#measurement) name,
[retention policy](/influxdb/v1.8/concepts/glossary/#retention-policy-rp) name,
[subscription](/influxdb/v1.8/concepts/glossary/#subscription) name, and
[user](/influxdb/v1.8/concepts/glossary/#user) name.
In those cases, `time` does not require double quotes in queries.
`time` cannot be a [field key](/influxdb/v1.8/concepts/glossary/#field-key) or
[tag key](/influxdb/v1.8/concepts/glossary/#tag-key);
InfluxDB rejects writes with `time` as a field key or tag key and returns an error.
See [Frequently Asked Questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#time) for more information.
## Writing data to InfluxDB
### Getting data in the database
Now that you know all about the InfluxDB line protocol, how do you actually get the
line protocol to InfluxDB?
Here, we'll give two quick examples and then point you to the
[Tools](/influxdb/v1.8/tools/) section for further
information.
#### InfluxDB API
Write data to InfluxDB using the InfluxDB API.
Send a `POST` request to the `/write` endpoint and provide your line protocol in
the request body:
```bash
curl -i -XPOST "http://localhost:8086/write?db=science_is_cool" --data-binary 'weather,location=us-midwest temperature=82 1465839830100400200'
```
For in-depth descriptions of query string parameters, status codes, responses,
and more examples, see the [API Reference](/influxdb/v1.8/tools/api/#write-http-endpoint).
#### CLI
Write data to InfluxDB using the InfluxDB command line interface (CLI).
[Launch](/influxdb/v1.8/tools/shell/#launch-influx) the CLI, use the relevant
database, and put `INSERT` in
front of your line protocol:
```sql
INSERT weather,location=us-midwest temperature=82 1465839830100400200
```
You can also use the CLI to
[import](/influxdb/v1.8/tools/shell/#import-data-from-a-file-with-import) line
protocol from a file.
There are several ways to write data to InfluxDB.
See the [Tools](/influxdb/v1.8/tools/) section for more
on the [InfluxDB API](/influxdb/v1.8/tools/api/#write-http-endpoint), the
[CLI](/influxdb/v1.8/tools/shell/), and the available Service Plugins (
[UDP](/influxdb/v1.8/tools/udp/),
[Graphite](/influxdb/v1.8/tools/graphite/),
[CollectD](/influxdb/v1.8/tools/collectd/), and
[OpenTSDB](/influxdb/v1.8/tools/opentsdb/)).
### Duplicate points
A point is uniquely identified by the measurement name, tag set, and timestamp.
If you submit line protocol with the same measurement, tag set, and timestamp,
but with a different field set, the field set becomes the union of the old
field set and the new field set, where any conflicts favor the new field set.
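For example, writing these two lines (same measurement, tag set, and timestamp, but different fields):

```
weather,location=us-midwest temperature=82 1465839830100400200
weather,location=us-midwest humidity=71 1465839830100400200
```

leaves a single point whose field set is `temperature=82,humidity=71`.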
For a complete example of this behavior and how to avoid it, see
[How does InfluxDB handle duplicate points?](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points)

View File

@ -0,0 +1,9 @@
{{ $expandLabel := .Get 0 }}
<div class="expand">
<p class="expand-label">
<span class="expand-toggle"></span><span>{{ $expandLabel }}</span>
</p>
<div class="expand-content" style="display: none;" >
{{ .Inner | safeHTML }}
</div>
</div>