diff --git a/content/influxdb/v1.3/_index.md b/content/influxdb/v1.3/_index.md new file mode 100644 index 000000000..13fb85846 --- /dev/null +++ b/content/influxdb/v1.3/_index.md @@ -0,0 +1,33 @@ +--- +title: InfluxDB 1.3 documentation + +menu: + influxdb: + name: v1.3 + identifier: influxdb_1_3 + weight: 30 +--- + +InfluxDB is a [time series database](https://en.wikipedia.org/wiki/Time_series_database) built from the ground up to handle high write and query loads. It is the second piece of the +[TICK stack](https://influxdata.com/time-series-platform/). +InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics. + +## Key Features + +Here are some of the features that InfluxDB currently supports that make it a great choice for working with time series data. + +* Custom high-performance datastore written specifically for time series data. +The TSM engine allows for high ingest speed and data compression. +* Written entirely in Go. +It compiles into a single binary with no external dependencies. +* Simple, high-performing write and query HTTP(S) APIs. +* Plugin support for other data ingestion protocols such as Graphite, collectd, and OpenTSDB. +* Expressive SQL-like query language tailored to easily query aggregated data. +* Tags allow series to be indexed for fast and efficient queries. +* Retention policies efficiently auto-expire stale data. +* Continuous queries automatically compute aggregate data to make frequent queries more efficient. +* Built-in web admin interface. + +However, the open source edition of InfluxDB runs on a single node. If your requirements dictate a high-availability setup +to eliminate a single point of failure, you should explore [InfluxDB Enterprise Edition](https://docs.influxdata.com/influxdb/v1.3/high_availability/). 
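Since the write path (the HTTP(S) API and line protocol) comes up repeatedly in these docs, here is a minimal sketch of what a write looks like from client code. It is illustrative only: the database name `mydb`, the host/port, and the helper function names are assumptions for this example, not part of the documentation.

```python
# Minimal sketch of writing a point to InfluxDB 1.x over the HTTP API.
# Assumptions (not from this doc): a local instance on the default port
# 8086 and an existing database named "mydb".
import time
import urllib.request


def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Build one line-protocol point: measurement,tag=v field=v timestamp.

    Simplified: float fields only (real line protocol needs an "i" suffix
    for integers and quoting/escaping for strings and special characters).
    """
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    ts = time.time_ns() if ts_ns is None else ts_ns
    return f"{measurement},{tag_str} {field_str} {ts}"


def write_point(line, db="mydb", host="http://localhost:8086"):
    """POST a line-protocol payload to /write; InfluxDB answers 204 on success."""
    req = urllib.request.Request(f"{host}/write?db={db}", data=line.encode())
    return urllib.request.urlopen(req)


line = to_line_protocol("cpu", {"host": "server-a"}, {"value": 0.64},
                        ts_ns=1500000000000000000)
# line == "cpu,host=server-a value=0.64 1500000000000000000"
```

The same payload could be sent from a shell with `curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary "$LINE"`; timestamps are nanosecond Unix epoch by default.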
diff --git a/content/influxdb/v1.3/about_the_project/_index.md b/content/influxdb/v1.3/about_the_project/_index.md new file mode 100644 index 000000000..4d151b82a --- /dev/null +++ b/content/influxdb/v1.3/about_the_project/_index.md @@ -0,0 +1,25 @@ +--- +title: About the project +menu: + influxdb_1_3: + name: About the project + weight: 10 +--- + +## [Release Notes/Changelog](/influxdb/v1.3/about_the_project/releasenotes-changelog/) + +## [Contributing](https://github.com/influxdata/influxdb/blob/master/CONTRIBUTING.md) + +## [CLA](https://influxdata.com/community/cla/) + +## [Licenses](https://github.com/influxdata/influxdb/blob/master/LICENSE) + +## Third Party Software +InfluxData products contain third party software, which means the copyrighted, patented, or otherwise legally protected +software of third parties that is incorporated in InfluxData products. + +Third party suppliers make no representation nor warranty with respect to such third party software or any portion thereof. +Third party suppliers assume no liability for any claim that might arise with respect to such third party software, nor for a +customer’s use of or inability to use the third party software. + +The [list of third party software components, including references to associated licenses and other materials](https://github.com/influxdata/influxdb/blob/1.3/LICENSE_OF_DEPENDENCIES.md), is maintained on a version by version basis. 
diff --git a/content/influxdb/v1.3/about_the_project/releasenotes-changelog.md b/content/influxdb/v1.3/about_the_project/releasenotes-changelog.md new file mode 100644 index 000000000..8a599cc28 --- /dev/null +++ b/content/influxdb/v1.3/about_the_project/releasenotes-changelog.md @@ -0,0 +1,677 @@ +--- +title: Release Notes/Changelog +aliases: + - /influxdb/v1.3/about_the_project/release-notes-changelog/ +menu: + influxdb_1_3: + weight: 1 + parent: About the project +--- +## v1.3.9 [2018-01-19] + +### Bugfixes + +- Improve performance when writes exceed `max-values-per-tag` or `max-series`. + +## v1.3.8 [2017-12-04] + +### Bugfixes + +- Add `influx_inspect inmem2tsi` command to convert existing in-memory (TSM-based) shards to the TSI (Time Series Index) format. +- Fix race condition in the merge iterator close method. +- Fix compaction aborting early and dropping remaining series. + +## v1.3.7 [2017-10-26] + +### Release Notes +Bug fix identified via Community and InfluxCloud. The build artifacts are now consistent with v1.3.5. + +### Bugfixes + +- Don't assume `which` is present in package post-install script. +- Fix use of `INFLUXD_OPTS` in service file. +- Fix missing man pages in new packaging output. +- Add RPM dependency on shadow-utils for `useradd`. +- Fix data deleted outside of specified time range when using `delete`. +- Fix data dropped incorrectly during compaction. +- Return `query.ErrQueryInterrupted` for a successful read on `InterruptCh`. +- Copy returned bytes from TSI meta functions. + +## v1.3.6 [2017-09-28] + +### Release Notes +Bug fix identified via Community and InfluxCloud. + +### Bugfixes +- Reduce how long it takes to walk the varrefs in an expression. +- Address panic: runtime error: invalid memory address or nil pointer dereference. +- Fix increased memory usage in cache and WAL readers for clusters with a large number of shards. +- Prevent deadlock when doing math on the result of a subquery. 
+- Fix several race conditions present in the shard and storage engine. +- Fix race condition on cache entry. + +## v1.3.5 [2017-08-29] + +### Release Notes +Bug fix identified via Community and InfluxCloud. + +### Bugfixes +- Fix race condition accessing `seriesByID` map. +- Fix deadlock when calling `SeriesIDsAllOrByExpr`. + +## v1.3.4 [2017-08-23] + +### Release Notes +Bug fix identified via Community and InfluxCloud. + +### Bugfixes +- Fix time boundaries for continuous queries with time zones. +- Fix time zone shifts when the shift happens on a time zone boundary. +- Parse time literals using the time zone in the SELECT statement. +- Fix drop measurement not dropping all data. +- Fix backups when snapshot is empty. +- Eliminate a cursor leak resulting in an accumulation of .tsm.tmp files after compactions. +- Fix deadlock when dropping measurement and writing. +- Ensure inputs are closed on error. Add runtime GC finalizer as additional guard to close iterators. +- Fix leaking tmp file when large compaction aborted. + +## v1.3.3 [2017-08-10] + +### Release Notes +Bug fix identified via Community and InfluxCloud. + +### Bugfixes + +- Resolve a memory leak where the reader was not closed when NewReaderIterator created a nilFloatIterator. + +## v1.3.2 [2017-08-04] + +### Release Notes +Minor bug fixes were identified via Community and InfluxCloud. + +### Bugfixes + +- Interrupt "in-progress" TSM compactions. +- Prevent excessive memory usage when dropping series. +- Significantly improve performance of SHOW TAG VALUES. + +## v1.3.1 [2017-07-20] + +### Release Notes +Minor bug fixes were identified via Community and InfluxCloud. + +### Bugfixes + +- Ensure temporary TSM files get cleaned up when a compaction is aborted. +- Address deadlock issue causing 1.3.0 to become unresponsive. 
+- Duplicate points generated via INSERT after DELETE. +- Fix the CQ start and end times to use Unix timestamps. + +## v1.3.0 [2017-06-21] + +### Release Notes + +#### TSI + +Version 1.3.0 marks the first official release of InfluxDB's new time series index (TSI) engine. + +The TSI engine is a significant technical advancement in InfluxDB. +It offers a solution to the [time-structured merge tree](https://docs.influxdata.com/influxdb/v1.2/concepts/storage_engine/) engine's [high series cardinality issue](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter). +With TSI, the number of series should be unbounded by the memory on the server hardware and the number of existing series will have a negligible impact on database startup time. +See Paul Dix's blog post [Path to 1 Billion Time Series: InfluxDB High Cardinality Indexing Ready for Testing](https://www.influxdata.com/path-1-billion-time-series-influxdb-high-cardinality-indexing-ready-testing/) for additional information. + +TSI is disabled by default in version 1.3. It should be considered an experimental feature and +is not recommended for production deployment at this time. +To enable TSI, uncomment the [`index-version` setting](/influxdb/v1.3/administration/config/#index-version-inmem) and set it to `tsi1`. +The `index-version` setting is in the `[data]` section of the configuration file. +Next, restart your InfluxDB instance. + +``` +[data] + dir = "/var/lib/influxdb/data" + index-version = "tsi1" +``` + +#### Continuous Query Statistics + +When enabled, each time a continuous query is completed, a number of details regarding the execution are written to the `cq_query` measurement of the internal monitor database (`_internal` by default). 
The tags and fields of interest are: + +| tag / field | description | +|:----------------- |:-------------------------------------------------- | +| `db` | name of database | +| `cq` | name of continuous query | +| `durationNS` | query execution time in nanoseconds | +| `startTime` | lower bound of time range | +| `endTime` | upper bound of time range | +| `pointsWrittenOK` | number of points written to the target measurement | + + +* `startTime` and `endTime` are UNIX timestamps, in nanoseconds. +* The number of points written is also included in CQ log messages. + +### Removals + +The admin UI has been removed and is unusable in this release. The `[admin]` configuration section will be ignored. + +### Configuration Changes + +* The top-level config `bind-address` now defaults to `localhost:8088`. + The previous default was just `:8088`, causing the backup and restore port to be bound on all available interfaces (i.e. including interfaces on the public internet). + +The following new configuration options are available. + +#### `[http]` Section + +* `max-body-size` was added with a default of 25,000,000, but can be disabled by setting it to 0. + Specifies the maximum size (in bytes) of a client request body. When a client sends data that exceeds + the configured maximum size, a `413 Request Entity Too Large` HTTP response is returned. + +#### `[continuous_queries]` Section + +* `query-stats-enabled` was added with a default of `false`. When set to `true`, continuous query execution statistics are written to the default monitor store. + +### Features + +- Add WAL sync delay +- Add chunked request processing back into the Go client v2 +- Allow non-admin users to execute SHOW DATABASES +- Reduce memory allocations by reusing gzip.Writers across requests +- Add system information to /debug/vars +- Add modulo operator to the query language. 
+- Failed points during an import now result in a non-zero exit code +- Expose some configuration settings via SHOW DIAGNOSTICS +- Support single and multiline comments in InfluxQL +- Support timezone offsets for queries +- Add "integral" function to InfluxQL +- Add "non_negative_difference" function to InfluxQL +- Add bitwise AND, OR and XOR operators to the query language +- Write throughput/concurrency improvements +- Remove the admin UI +- Update to go1.8.1 +- Add max concurrent compaction limits +- Add TSI support tooling +- Track HTTP client requests for /write and /query with /debug/requests +- Write and compaction stability +- Add new profile endpoint for gathering all debug profiles and queries in single archive +- Add nanosecond duration literal support +- Optimize top() and bottom() using an incremental aggregator +- Maintain the tags of points selected by top() or bottom() when writing the results. +- Write CQ stats to the `_internal` database + +### Bugfixes + +- Several statements were missing the DefaultDatabase method +- Fix spelling mistake in HTTP section of config -- shared-sercret +- History file should redact passwords before saving to history +- Suppress headers in output for influx cli when they are the same +- Add chunked/chunk size as setting/options in cli +- Do not increment the continuous query statistic if no query is run +- Forbid wildcards in binary expressions +- Fix fill(linear) when multiple series exist and there are null values +- Update liner dependency to handle docker exec +- Bind backup and restore port to localhost by default +- Kill query not killing query +- KILL QUERY should work during all phases of a query +- Simplify admin user check. 
+- Significantly improve DROP DATABASE speed +- Return an error when an invalid duration literal is parsed +- Fix the time range when an exact timestamp is selected +- Fix query parser when using addition and subtraction without spaces +- Fix a regression when math was used with selectors +- Ensure the input for certain functions in the query engine is ordered +- Significantly improve shutdown speed for high cardinality databases +- Fix racy integration test +- Prevent overflowing or underflowing during window computation +- Enabled golint for admin, httpd, subscriber, udp, thanks @karlding +- Implicitly cast null to false in binary expressions with a boolean +- Restrict fill(none) and fill(linear) to be usable only with aggregate queries +- Restrict top() and bottom() selectors to be used with no other functions +- top() and bottom() now return the time for every point +- Remove default upper time bound on DELETE queries +- Fix LIMIT and OFFSET for certain aggregate queries +- Refactor the subquery code and fix outer condition queries +- Fix compaction aborted log messages +- TSM compaction does not remove .tmp on error +- Set the CSV output to an empty string for null values +- Compaction exhausting disk resources in InfluxDB +- Small edits to the etc/config.sample.toml file +- Points beyond retention policy scope are dropped silently +- Fix TSM tmp file leaked on disk +- Fix large field keys preventing snapshot compactions +- URL query parameter credentials take priority over Authentication header +- TSI branch has duplicate tag values +- Out of memory when using HTTP API +- Check file count before attempting a TSI level compaction. +- index file fd leak in tsi branch +- Fix TSI non-contiguous compaction panic + +## v1.2.4 [2017-05-08] + +### Bugfixes + +- Prefix partial write errors with `partial write:` to generalize identification in other subsystems. + +## v1.2.3 [2017-04-17] + +### Bugfixes + +- Redact passwords before saving them to the history file. 
+- Add the missing DefaultDatabase method to several InfluxQL statements. +- Fix segment violation in models.Tags.Get. +- Simplify the admin user check. +- Fix a regression when math was used with selectors. +- Ensure the input for certain functions in the query engine is ordered. +- Fix issue where deleted `time` field keys created unparseable points. + +## v1.2.2 [2017-03-14] + +### Release Notes + +### Configuration Changes + +#### `[http]` Section + +* [`max-row-limit`](/influxdb/v1.3/administration/config/#max-row-limit-0) now defaults to `0`. +In versions 1.0 and 1.1, the default setting was `10000`, but due to a bug, the value in use in versions 1.0 and 1.1 was effectively `0`. +In versions 1.2.0 through 1.2.1, we fixed that bug, but the fix caused a breaking change for Grafana and Kapacitor users; users who had not set `max-row-limit` to `0` experienced truncated/partial data due to the `10000` row limit. +In version 1.2.2, we've changed the default `max-row-limit` setting to `0` to match the behavior in versions 1.0 and 1.1. + +### Bugfixes + +- Change the default [`max-row-limit`](/influxdb/v1.3/administration/config/#max-row-limit-0) setting from `10000` to `0` to prevent the absence of data in Grafana or Kapacitor. + +## v1.2.1 [2017-03-08] + +### Release Notes + +### Bugfixes + +- Treat non-reserved measurement names with underscores as normal measurements. +- Reduce the expression in a subquery to avoid a panic. +- Properly select a tag within a subquery. +- Prevent a panic when aggregates are used in an inner query with a raw query. +- Points missing after compaction. +- Point.UnmarshalBinary() bounds check. +- Interface conversion: tsm1.Value is tsm1.IntegerValue, not tsm1.FloatValue. +- Map types correctly when using a regex and one of the measurements is empty. +- Map types correctly when selecting a field with multiple measurements where one of the measurements is empty. +- Include IsRawQuery in the rewritten statement for meta queries. 
+- Fix race in WALEntry.Encode and Values.Deduplicate +- Fix panic in collectd when configured to read types DB from directory. +- Fix ORDER BY time DESC with ordering series keys. +- Fix mapping of types when the measurement uses a regular expression. +- Fix LIMIT and OFFSET when they are used in a subquery. +- Fix incorrect math when aggregates that emit different times are used. +- Fix EvalType when a parenthesis expression is used. +- Fix authentication when subqueries are present. +- Expand query dimensions from the subquery. +- Dividing aggregate functions with different outputs doesn't panic. +- Anchors not working as expected with case-insensitive regular expression. + +## v1.2.0 [2017-01-24] + +### Release Notes + +This release introduces a major new querying capability in the form of sub-queries, and provides several performance improvements, including a 50% or better gain in write performance on larger numbers of cores. The release adds some stability and memory-related improvements, as well as several CLI-related bug fixes. If upgrading from a prior version, please read the configuration changes in the following section before upgrading. + +### Configuration Changes + +The following new configuration options are available, if upgrading to `1.2.0` from prior versions. + +#### `[[collectd]]` Section + +* `security-level` which defaults to `"none"`. This field also accepts `"sign"` and `"encrypt"` and enables different levels of transmission security for the collectd plugin. +* `auth-file` which defaults to `"/etc/collectd/auth_file"`. Specifies where to locate the authentication file used to authenticate clients when using signed or encrypted mode. + +### Deprecations + +The stress tool `influx_stress` will be removed in a subsequent release. We recommend using [`influx-stress`](https://github.com/influxdata/influx-stress) as a replacement. + +### Features + +- Remove the override of GOMAXPROCS. 
+- Uncomment section headers from the default configuration file. +- Improve write performance significantly. +- Prune data in meta store for deleted shards. +- Update latest dependencies with Godeps. +- Introduce syntax for marking a partial response with chunking. +- Use X-Forwarded-For IP address in HTTP logger if present. +- Add support for secure transmission via collectd. +- Switch logging to use structured logging everywhere. +- [CLI feature request] USE retention policy for queries. +- Add clear command to CLI. +- Add ability to use parameters in queries in the v2 client using the `Parameters` map in the `Query` struct. +- Allow adding items to array config via ENV. +- Support subquery execution in the query language. +- Verbose output for SSL connection errors. +- Cache snapshotting performance improvements. + +### Bugfixes + +- Fix potential race condition in correctness of tsm1_cache memBytes statistic. +- Fix broken error return on meta client's UpdateUser and DropContinuousQuery methods. +- Fix string quoting and significantly improve performance of `influx_inspect export`. +- CLI was caching db/rp for insert into statements. +- Fix CLI import bug when using self-signed SSL certificates. +- Fix cross-platform backup/restore. +- Ensure that all user privileges associated with a database are removed when the database is dropped. +- Return the time from a percentile call on an integer. +- Expand string and boolean fields when using a wildcard with `sample()`. +- Fix chuid argument order in init script. +- Reject invalid subscription URLs. +- CLI should use spaces for alignment, not tabs. +- 0.12.2 InfluxDB CLI client PRECISION returns "Unknown precision...". +- Fix parse key panic when missing tag value. +- Retention Policy should not allow `INF` or `0` as a shard duration. +- Return an error instead of panicking when decoding point values. +- Fix slice out of bounds panic when pruning shard groups. +- Drop database will delete /influxdb/data directory. 
+- Ensure Subscriber service can be disabled. +- Fix race in storage engine. +- InfluxDB should do a partial write on mismatched type errors. + +## v1.1.5 [2017-05-08] + +### Bugfixes + +- Redact passwords before saving them to the history file. +- Add the missing DefaultDatabase method to several InfluxQL statements. + +## v1.1.4 [2017-02-27] + +### Bugfixes + +- Backport from 1.2.0: Reduce GC allocations. + +## v1.1.3 [2017-02-17] + +### Bugfixes + +- Remove Tags.shouldCopy, replace with forceCopy on series creation. + +## v1.1.2 [2017-02-16] + +### Bugfixes + +- Fix memory leak when writing new series over HTTP. +- Fix series tag iteration segfault. +- Fix tag dereferencing panic. + +## v1.1.1 [2016-12-06] + +### Features + +- Update Go version to 1.7.4. + +### Bugfixes + +- Fix string fields w/ trailing slashes. +- Quote the empty string as an ident. +- Fix incorrect tag value in error message. + +### Security + +[Go 1.7.4](https://golang.org/doc/devel/release.html#go1.7.minor) was released to address two security issues. This release includes these security fixes. + +## v1.1.0 [2016-11-14] + +### Release Notes + +This release is built with Go 1.7.3 and provides many performance optimizations, stability changes and a few new query capabilities. If upgrading from a prior version, please read the configuration changes in the section below before upgrading. + +### Deprecations + +The admin interface is deprecated and will be removed in a subsequent release. +The configuration setting to enable the admin UI is now disabled by default, but can be enabled if necessary. +We recommend using [Chronograf](https://github.com/influxdata/chronograf) or [Grafana](https://github.com/grafana/grafana) as a replacement. + +### Configuration Changes + +The following configuration changes may need to be made before upgrading to `1.1.0` from prior versions. + +#### `[admin]` Section + +* `enabled` now defaults to `false`. 
If you are currently using the admin interface, you will need to change this value to `true` to re-enable it. The admin interface is currently deprecated and will be removed in a subsequent release. + +#### `[data]` Section + +* `max-values-per-tag` was added with a default of 100,000, but can be disabled by setting it to `0`. Existing measurements with tags that exceed this limit will continue to load, but writes that would cause the tag cardinality to increase will be dropped and a `partial write` error will be returned to the caller. This limit can be used to prevent high cardinality tag values from being written to a measurement. +* `cache-max-memory-size` has been increased from `524288000` to `1048576000`. This setting is the maximum amount of RAM, in bytes, a shard cache can use before it rejects writes with an error. Setting this value to `0` disables the limit. +* `cache-snapshot-write-cold-duration` has been decreased from `1h` to `10m`. This setting determines how long values will stay in the shard cache while the shard is cold for writes. +* `compact-full-write-cold-duration` has been decreased from `24h` to `4h`. The shorter duration allows cold shards to be compacted to an optimal state more quickly. + +### Features + +The query language has been extended with a few new features: + +- Support regular expressions on field keys in the SELECT clause. +- New `linear` fill option. +- New `cumulative_sum` function. +- Support `ON` for `SHOW` commands. + +All Changes: + +- Filter out series within shards that do not have data for that series. +- Rewrite regular expressions of the form host = /^server-a$/ to host = 'server-a', to take advantage of the tsdb index. +- Improve compaction planning performance by caching tsm file stats. +- Align binary math expression streams by time. +- Reduce map allocations when computing the TagSet of a measurement. +- Make input plugin services open/close idempotent. +- Speed up shutdown by closing shards concurrently. 
+- Add sample function to query language. +- Add `fill(linear)` to query language. +- Implement cumulative_sum() function. +- Update defaults in config for latest best practices. +- UDP Client: Split large points. +- Add stats for active compactions, compaction errors. +- More man pages for the other tools we package and compress man pages fully. +- Add max-values-per-tag to limit high tag cardinality data. +- Update jwt-go dependency to version 3. +- Support enabling the HTTP service over a unix domain socket. +- Add additional statistics to query executor. +- Feature request: `influx inspect -export` should dump WAL files. +- Implement text/csv content encoding for the response writer. +- Support tools for running async queries. +- Support ON and use default database for SHOW commands. +- Correctly read in input from a non-interactive stream for the CLI. +- Support `INFLUX_USERNAME` and `INFLUX_PASSWORD` for setting username/password in the CLI. +- Optimize first/last when no group by interval is present. +- Make regular expressions work on field and dimension keys in SELECT clause. +- Change default time boundaries for raw queries. +- Support mixed duration units. + +### Bugfixes + +- Avoid deadlock when `max-row-limit` is hit. +- Fix incorrect grouping when multiple aggregates are used with sparse data. +- Fix output duration units for SHOW QUERIES. +- Truncate the version string when linking to the documentation. +- influx_inspect: export does not escape field keys. +- Fix issue where point would be written to wrong shard. +- Fix retention policy inconsistencies. +- Remove accidentally added string support for the stddev call. +- Remove /data/process_continuous_queries endpoint. +- Enable https subscriptions to work with custom CA certificates. +- Reduce query planning allocations. +- Shard stats include WAL path tag so disk bytes make more sense. +- Panic with unread show series iterators during drop database. 
+- Use consistent column output from the CLI for column-formatted responses. +- Correctly use password-type field in Admin UI. +- Duplicate parsing bug in ALTER RETENTION POLICY. +- Fix database locked up when deleting shards. +- Fix mmap dereferencing. +- Fix base64 encoding issue with /debug/vars stats. +- Drop measurement causes cache max memory exceeded error. +- Decrement number of measurements only once when deleting the last series from a measurement. +- Delete statement returns an error when retention policy or database is specified. +- Fix the dollar sign so it properly handles reserved keywords. +- Exceeding max retention policy duration gives incorrect error message. +- Drop time when used as a tag or field key. + +## v1.0.2 [2016-10-05] + +### Bugfixes + +- Fix RLE integer decoding producing negative numbers. +- Avoid stat syscall when planning compactions. +- Subscription data loss under high write load. +- Do not automatically reset the shard duration when using ALTER RETENTION POLICY. +- Ensure correct shard groups created when retention policy has been altered. + +## v1.0.1 [2016-09-26] + +### Bugfixes + +- Prevent users from manually using system queries since incorrect use would result in a panic. +- Ensure fieldsCreated stat available in shard measurement. +- Report cmdline and memstats in /debug/vars. +- Fix typo within example configuration file. +- Implement time math for lazy time literals. +- Fix database locked up when deleting shards. +- Skip past points at the same time in derivative call within a merged series. +- Read an invalid JSON response as an error in the Influx client. + +## v1.0.0 [2016-09-08] + +### Release Notes +Initial release of InfluxDB. + +### Breaking changes + +* `max-series-per-database` was added with a default of 1M but can be disabled by setting it to `0`. Existing databases with series that exceed this limit will continue to load but writes that would create new series will fail. 
+* Config option `[cluster]` has been replaced with `[coordinator]`. +* Support for config options `[collectd]` and `[opentsdb]` has been removed; use `[[collectd]]` and `[[opentsdb]]` instead. +* Config option `data-logging-enabled` within the `[data]` section has been renamed to `trace-logging-enabled`, and defaults to `false`. +* The keywords `IF`, `EXISTS`, and `NOT` were removed for this release. This means you no longer need to specify `IF NOT EXISTS` for `CREATE DATABASE` or `IF EXISTS` for `DROP DATABASE`. If these are specified, a query parse error is returned. +* The Shard `writePointsFail` stat has been renamed to `writePointsErr` for consistency with other stats. + +With this release the systemd configuration files for InfluxDB will use the system-configured default for logging and will no longer write files to `/var/log/influxdb` by default. On most systems, the logs will be directed to the systemd journal and can be accessed by `journalctl -u influxdb.service`. Consult the systemd journald documentation for configuring journald. + +### Features + +- Add mode function. +- Support negative timestamps for the query engine. +- Write path stats. +- Add MaxSeriesPerDatabase config setting. +- Remove IF EXISTS/IF NOT EXISTS from InfluxQL language. +- Update go package library dependencies. +- Add tsm file export to influx_inspect tool. +- Create man pages for commands. +- Return 403 Forbidden when authentication succeeds but authorization fails. +- Add favicon. +- Run continuous query for multiple buckets rather than one per bucket. +- Log the CQ execution time when continuous query logging is enabled. +- Trim BOM from Windows Notepad-saved config files. +- Update help and remove unused config options from the configuration file. +- Add NodeID to execution options. +- Make httpd logger closer to Common (& combined) Log Format. +- Allow any variant of the help option to trigger the help. +- Reduce allocations during query parsing. 
+- Optimize timestamp run-length decoding. +- Add monitoring statistic for on-disk shard size. +- Add HTTP(S)-based subscriptions. +- Add new HTTP statistics to monitoring. +- Speed up drop database. +- Add Holt-Winters forecasting function. +- Add support for JWT token authentication. +- Add ability to create snapshots of shards. +- Parallelize iterators. +- Teach the http service how to enforce connection limits. +- Support cast syntax for selecting a specific type. +- Refactor monitor service to avoid expvar and write monitor statistics on a truncated time interval. +- Dynamically update the documentation link in the admin UI. +- Support wildcards in aggregate functions. +- Support specifying a retention policy for the graphite service. +- Add extra trace logging to tsm engine. +- Add stats and diagnostics to the TSM engine. +- Support regex selection in SHOW TAG VALUES for the key. +- Modify the default retention policy name and make it configurable. +- Update SHOW FIELD KEYS to return the field type with the field key. +- Support bound parameters in the parser. +- Add https-private-key option to httpd config. +- Support loading a folder for collectd typesdb files. + +### Bugfixes + +- Optimize queries that compare a tag value to an empty string. +- Allow blank lines in the line protocol input. +- Runtime: goroutine stack exceeds 1000000000-byte limit. +- Fix alter retention policy when all options are used. +- Concurrent series limit. +- Ensure gzip writer is closed in influx_inspect export. +- Fix CREATE DATABASE when dealing with default values. +- Fix UDP pointsRx being incremented twice. +- Tombstone memory improvements. +- Hardcode auto generated RP names to autogen. +- Ensure IDs can't clash when managing Continuous Queries. +- Continuous full compactions. +- Remove limiter from walkShards. +- Copy tags in influx_stress to avoid a concurrent write panic on a map. +- Do not run continuous queries that have no time span. 
+- Move the CQ interval by the group by offset. +- Fix panic parsing empty key. +- Update connection settings when changing hosts in CLI. +- Always use the demo config when outputting a new config. +- Minor improvements to init script. Removes sysvinit-utils as package dependency. +- Fix compaction planning with large TSM files. +- Duplicate data for the same timestamp. +- Fix panic: truncate the slice when merging the caches. +- Fix regex binary encoding for a measurement. +- Fix fill(previous) when used with math operators. +- Rename dumptsmdev to dumptsm in influx_inspect. +- Remove a double lock in the tsm1 index writer. +- Remove FieldCodec from TSDB package. +- Allow a non-admin to call "use" for the influx CLI. +- Set the condition cursor instead of aux iterator when creating a nil condition cursor. +- Update `stress/v2` to work with clusters, ssl, and username/password auth. Code cleanup. +- Modify the max nanosecond time to be one nanosecond less. +- Include sysvinit-tools as an rpm dependency. +- Add port to all graphite log output to help with debugging multiple endpoints. +- Fix panic: runtime error: index out of range. +- Remove systemd output redirection. +- Database unresponsive after DROP MEASUREMENT. +- Address Out of Memory Error when Dropping Measurement. +- Fix the point validation parser to identify and sort tags correctly. +- Prevent panic in concurrent auth cache write. +- Set X-Influxdb-Version header on every request (even 404 requests). +- Prevent panic if there are no values. +- Time sorting broken with overwritten points. +- queries with strings that look like dates end up with date types, not string types. +- Concurrent map read write panic. +- Drop writes from before the retention policy time window. +- Fix SELECT statement required privileges. +- Filter out sources that do not match the shard database/retention policy. +- Truncate the shard group end time if it exceeds MaxNanoTime. +- Batch SELECT INTO / CQ writes. 
+- Fix compaction planning re-compacting large TSM files. +- Ensure client sends correct precision when inserting points. +- Accept points with trailing whitespace. +- Fix panic in SHOW FIELD KEYS. +- Disable limit optimization when using an aggregate. +- Fix panic: interface conversion: tsm1.Value is \*tsm1.StringValue, not \*tsm1.FloatValue. +- Data race when dropping a database immediately after writing to it. +- Make sure admin exists before authenticating query. +- Print the query executor's stack trace on a panic to the log. +- Fix read tombstones: EOF. +- Query-log-enabled in config not ignored anymore. +- Ensure clients requesting gzip encoded bodies don't receive empty body. +- Optimize shard loading. +- Queries slow down hundreds times after overwriting points. +- SHOW TAG VALUES accepts != and !~ in WHERE clause. +- Remove old cluster code. +- Ensure that future points considered in SHOW queries. +- Fix full compactions conflicting with level compactions. +- Overwriting points on large series can cause memory spikes during compactions. +- Fix parseFill to check for fill ident before attempting to parse an expression. +- Max index entries exceeded. +- Address slow startup time. +- Fix measurement field panic in tsm1 engine. +- Queries against files that have just been compacted need to point to new files. +- Check that retention policies exist before creating CQ. diff --git a/content/influxdb/v1.3/administration/_index.md b/content/influxdb/v1.3/administration/_index.md new file mode 100644 index 000000000..33c9fd113 --- /dev/null +++ b/content/influxdb/v1.3/administration/_index.md @@ -0,0 +1,33 @@ +--- +title: Administering InfluxDB +menu: + influxdb_1_3: + name: Administration + weight: 50 +--- + +The administration documentation contains all the information needed to administer a working InfluxDB installation. + +## [Logs](/influxdb/v1.3/administration/logs/) + +Information on how to direct InfluxDB log output. 
+ +## [Ports](/influxdb/v1.3/administration/ports/) + +## [Backup and Restore](/influxdb/v1.3/administration/backup_and_restore/) + +Procedures to back up data created by InfluxDB and to restore from a backup. + +## [Differences between InfluxDB 1.3 and 1.2](/influxdb/v1.3/administration/differences/) + +## [Differences between InfluxDB 1.3 and versions prior to 1.2](/influxdb/v1.3/administration/previous_differences/) + +## [Upgrading](/influxdb/v1.3/administration/upgrading/) + +Information about upgrading from previous versions of InfluxDB. + +## [Configuration](/influxdb/v1.3/administration/config/) + +Information about the configuration file `influxdb.conf`. + +## [Stability and Compatibility](/influxdb/v1.3/administration/stability_and_compatibility/) diff --git a/content/influxdb/v1.3/administration/backup_and_restore.md b/content/influxdb/v1.3/administration/backup_and_restore.md new file mode 100644 index 000000000..6ad385cd5 --- /dev/null +++ b/content/influxdb/v1.3/administration/backup_and_restore.md @@ -0,0 +1,197 @@ +--- +title: Backup and Restore + +menu: + influxdb_1_3: + weight: 30 + parent: Administration +--- + +## Backups + +InfluxDB has the ability to snapshot an instance at a point in time and restore it. +All backups are full backups. +InfluxDB does not yet support incremental backups. +There are two types of data to back up: the metastore and the metrics themselves. +The [metastore](/influxdb/v1.3/concepts/glossary/#metastore) is backed up in its entirety. +The metrics are backed up per-database in a separate operation from the metastore backup. + +> **Note:** Backups are not interchangeable between InfluxDB OSS and [InfluxEnterprise](/enterprise/v1.3/). +You cannot restore an OSS backup to an InfluxEnterprise data node, nor can you restore +an InfluxEnterprise backup to an OSS instance.
+> +If you are working with an InfluxEnterprise cluster, please see the [Backup +and Restore Guide](/enterprise/v1.3/guides/backup-and-restore/) in the +InfluxEnterprise documentation. + +### Backing up the Metastore + +InfluxDB's metastore contains internal information about the status of +the system, including user information, database/shard metadata, CQs, RPs, +and subscriptions. While a node is running, you can +create a backup of your instance's metastore by running the command: + +``` +influxd backup <path-to-backup> +``` + +Where `<path-to-backup>` is the directory where you +would like the backup to be written. Without any other arguments, +the backup will only record the current state of the system +metastore. For example, the command: + +```bash +$ influxd backup /tmp/backup +2016/02/01 17:15:03 backing up metastore to /tmp/backup/meta.00 +2016/02/01 17:15:03 backup complete +``` + +Will create a metastore backup in the directory `/tmp/backup` (the +directory will be created if it doesn't already exist). + +### Backing up a Database + +Each database must be backed up individually. + +To back up a database, you will need to add the `-database` flag: + +```bash +influxd backup -database <mydatabase> <path-to-backup> +``` + +Where `<mydatabase>` is the name of the database you would like to +back up, and `<path-to-backup>` is where the backup data should be +stored. Optional flags also include: + +- `-retention <retention policy name>` - This flag can be used to + back up a specific retention policy. For more information on + retention policies, please see + [here](/influxdb/v1.3/query_language/database_management/#retention-policy-management). If + not specified, all retention policies will be backed up. + +- `-shard <shard ID>` - This flag can be used to back up a specific + shard ID. To see which shards are available, you can run the command + `SHOW SHARDS` using the InfluxDB query language. If not specified, + all shards will be backed up.
+ +- `-since <date>` - This flag can be used to create a backup _since_ a + specific date, where the date must be in + [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format (for example, + `2015-12-24T08:12:23Z`). This flag is important if you would like to + take incremental backups of your database. If not specified, all + time ranges within the database will be backed up. + +> **Note:** Metastore backups are also included in per-database backups. + +As a real-world example, you can take a backup of the `autogen` +retention policy for the `telegraf` database since midnight UTC on +February 1st, 2016 by using the command: + +``` +$ influxd backup -database telegraf -retention autogen -since 2016-02-01T00:00:00Z /tmp/backup +2016/02/01 18:02:36 backing up rp=default since 2016-02-01 00:00:00 +0000 UTC +2016/02/01 18:02:36 backing up metastore to /tmp/backup/meta.01 +2016/02/01 18:02:36 backing up db=telegraf rp=default shard=2 to /tmp/backup/telegraf.default.00002.01 since 2016-02-01 00:00:00 +0000 UTC +2016/02/01 18:02:36 backup complete +``` + +This will send the resulting backup to `/tmp/backup`, where it can +then be compressed and sent to long-term storage. + +### Remote Backups + +To capture a backup from a remote node: + +**1.** Uncomment the [`bind-address` configuration setting](/influxdb/v1.3/administration/config/#bind-address-127-0-0-1-8088) on the remote node +**2.** Update the `bind-address` setting to `<remote-node-IP>:8088` +**3.** Run the following command from your local node: + +``` +$ influxd backup -database mydatabase -host <remote-node-IP>:8088 /tmp/mysnapshot +``` + +## Restore + +To restore a backup, you will need to use the `influxd restore` command. + +> **Note:** Restoring from backup is only supported while the InfluxDB daemon is stopped. + +To restore from a backup you will need to specify the type of backup, +the path to where the backup should be restored, and the path to the backup.
+The command: + +``` +influxd restore [ -metadir <path-to-meta-directory> ] [ -datadir <path-to-data-directory> ] <path-to-backup> +``` + +The required flags for restoring a backup are: + +- `-metadir <path-to-meta-directory>` - This is the path to the meta + directory where you would like the metastore backup recovered + to. For packaged installations, this should be specified as + `/var/lib/influxdb/meta`. + +- `-datadir <path-to-data-directory>` - This is the path to the data + directory where you would like the database backup recovered to. For + packaged installations, this should be specified as + `/var/lib/influxdb/data`. + +The optional flags for restoring a backup are: + +- `-database <database>` - This is the database that you would like to + restore the data to. This option is required if no `-metadir` option + is provided. + +- `-retention <retention policy>` - This is the target retention policy + for the stored data to be restored to. + +- `-shard <shard id>` - This is the shard data that should be + restored. If specified, `-database` and `-retention` must also be + set. + +Following the backup example above, the backup can be restored in two +steps. First, the metastore needs to be restored so that InfluxDB +knows which databases exist: + +``` +$ influxd restore -metadir /var/lib/influxdb/meta /tmp/backup +Using metastore snapshot: /tmp/backup/meta.00 +``` + +Once the metastore has been restored, we can now recover the backed up +data. In the real-world example above, we backed up the `telegraf` +database to `/tmp/backup`, so let's restore that same dataset. To +restore the `telegraf` database: + +``` +$ influxd restore -database telegraf -datadir /var/lib/influxdb/data /tmp/backup +Restoring from backup /tmp/backup/telegraf.* +unpacking /var/lib/influxdb/data/telegraf/default/2/000000004-000000003.tsm +unpacking /var/lib/influxdb/data/telegraf/default/2/000000005-000000001.tsm +``` + +> **Note:** Once the backed up data has been recovered, the +permissions on the shards may no longer be accurate.
To ensure +the file permissions are correct, please run: + +> `$ sudo chown -R influxdb:influxdb /var/lib/influxdb` + +Once the data and metastore are recovered, it's time to start the database: + +```bash +$ service influxdb start +``` + +As a quick check, we can verify the database is known to the metastore +by running a `SHOW DATABASES` command: + +``` +influx -execute 'show databases' +name: databases +--------------- +name +_internal +telegraf +``` + +The database has now been successfully restored! diff --git a/content/influxdb/v1.3/administration/config.md b/content/influxdb/v1.3/administration/config.md new file mode 100644 index 000000000..9be5bdaa0 --- /dev/null +++ b/content/influxdb/v1.3/administration/config.md @@ -0,0 +1,1047 @@ +--- +title: Configuration +menu: + influxdb_1_3: + weight: 70 + parent: Administration +--- + +The InfluxDB configuration file contains configuration settings specific to a local node. + +#### Content + +* [Using configuration files](#using-configuration-files) + * [Configuration options overview](#configuration-options-overview) + * [Environment variables](#environment-variables) + +_**Configuration options by section**_ + +* [Global options](#global-options) + * [reporting-disabled](#reporting-disabled-false) + * [bind-address](#bind-address-127-0-0-1-8088) + * [GOMAXPROCS](#gomaxprocs) +* [[meta]](#meta) + * [dir](#dir-var-lib-influxdb-meta) + * [retention-autocreate](#retention-autocreate-true) + * [logging-enabled](#logging-enabled-true) +* [[data]](#data) + * [dir](#dir-var-lib-influxdb-data) + * [index-version](#index-version-inmem) + * [wal-dir](#wal-dir-var-lib-influxdb-wal) + * [wal-fsync-delay](#wal-fsync-delay-0s) + * [trace-logging-enabled](#trace-logging-enabled-false) + * [query-log-enabled](#query-log-enabled-true) + * [cache-max-memory-size](#cache-max-memory-size-1073741824) + * [cache-snapshot-memory-size](#cache-snapshot-memory-size-26214400) + * 
[cache-snapshot-write-cold-duration](#cache-snapshot-write-cold-duration-10m) + * [compact-full-write-cold-duration](#compact-full-write-cold-duration-4h) + * [max-concurrent-compactions](#max-concurrent-compactions-0) + * [max-series-per-database](#max-series-per-database-1000000) + * [max-values-per-tag](#max-values-per-tag-100000) +* [[coordinator]](#coordinator) + * [write-timeout](#write-timeout-10s) + * [max-concurrent-queries](#max-concurrent-queries-0) + * [query-timeout](#query-timeout-0s) + * [log-queries-after](#log-queries-after-0s) + * [max-select-point](#max-select-point-0) + * [max-select-series](#max-select-series-0) + * [max-select-buckets](#max-select-buckets-0) +* [[retention]](#retention) + * [enabled](#enabled-true) + * [check-interval](#check-interval-30m0s) +* [[shard-precreation]](#shard-precreation) + * [enabled](#enabled-true-1) + * [check-interval](#check-interval-10m) + * [advance-period](#advance-period-30m) +* [[admin]](#admin) +* [[monitor]](#monitor) + * [store-enabled](#store-enabled-true) + * [store-database](#store-database-internal) + * [store-interval](#store-interval-10s) +* [[subscriber]](#subscriber) + * [enabled](#enabled-true-3) + * [http-timeout](#http-timeout-30s) + * [insecure-skip-verify](#insecure-skip-verify-false) + * [ca-certs](#ca-certs) + * [write-concurrency](#write-concurrency-40) + * [write-buffer-size](#write-buffer-size-1000) +* [[http]](#http) + * [enabled](#enabled-true-2) + * [bind-address](#bind-address-8086) + * [auth-enabled](#auth-enabled-false) + * [realm](#realm-influxdb) + * [log-enabled](#log-enabled-true) + * [write-tracing](#write-tracing-false) + * [pprof-enabled](#pprof-enabled-true) + * [https-enabled](#https-enabled-false) + * [https-certificate](#https-certificate-etc-ssl-influxdb-pem) + * [https-private-key](#https-private-key) + * [shared-secret](#shared-secret) + * [max-row-limit](#max-row-limit-0) + * [max-connection-limit](#max-connection-limit-0) + * 
[unix-socket-enabled](#unix-socket-enabled-false) + * [bind-socket](#bind-socket-var-run-influxdb-sock) + * [max-body-size](#max-body-size-25000000) +* [[[graphite]]](#graphite) + * [enabled](#enabled-false-1) + * [database](#database-graphite) + * [retention-policy](#retention-policy) + * [bind-address](#bind-address-2003) + * [protocol](#protocol-tcp) + * [consistency-level](#consistency-level-one) + * [batch-size](#batch-size-5000) + * [batch-pending](#batch-pending-10) + * [batch-timeout](#batch-timeout-1s) + * [udp-read-buffer](#udp-read-buffer-0) + * [separator](#separator) +* [[[collectd]]](#collectd) + * [enabled](#enabled-false-2) + * [bind-address](#bind-address-25826) + * [database](#database-collectd) + * [retention-policy](#retention-policy-1) + * [typesdb](#typesdb-usr-local-share-collectd) + * [security-level](#security-level-none) + * [auth-file](#auth-file-etc-collectd-auth-file) + * [batch-size](#batch-size-5000-1) + * [batch-pending](#batch-pending-10-1) + * [batch-timeout](#batch-timeout-10s) + * [read-buffer](#read-buffer-0) +* [[[opentsdb]]](#opentsdb) + * [enabled](#enabled-false-3) + * [bind-address](#bind-address-4242) + * [database](#database-opentsdb) + * [retention-policy](#retention-policy-2) + * [consistency-level](#consistency-level-one-1) + * [tls-enabled](#tls-enabled-false) + * [certificate](#certificate-etc-ssl-influxdb-pem) + * [log-point-errors](#log-point-errors-true) + * [batch-size](#batch-size-1000) + * [batch-pending](#batch-pending-5) + * [batch-timeout](#batch-timeout-1s-1) +* [[[udp]]](#udp) + * [enabled](#enabled-false-4) + * [bind-address](#bind-address-8089) + * [database](#database-udp) + * [retention-policy](#retention-policy-3) + * [batch-size](#batch-size-5000-2) + * [batch-pending](#batch-pending-10-2) + * [batch-timeout](#batch-timeout-1s-2) + * [read-buffer](#read-buffer-0-1) + * [precision](#precision) +* [[continuous_queries]](#continuous-queries) + * [enabled](#enabled-true-4) + * 
[log-enabled](#log-enabled-true-1) + * [query-stats-enabled](#query-stats-enabled-false) + * [run-interval](#run-interval-1s) + + +## Using Configuration Files + +The system has internal defaults for every configuration file setting. +View the default configuration settings with the `influxd config` command. + +Most of the settings in the local configuration file +(`/etc/influxdb/influxdb.conf`) are commented out; all +commented-out settings will be determined by the internal defaults. +Any uncommented settings in the local configuration file override the +internal defaults. +Note that the local configuration file does not need to include every +configuration setting. + +There are two ways to launch InfluxDB with your configuration file: + +* Point the process to the correct configuration file by using the `-config` +option: + + ```bash + influxd -config /etc/influxdb/influxdb.conf + ``` +* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your +configuration file and start the process. +For example: + + ``` + echo $INFLUXDB_CONFIG_PATH + /etc/influxdb/influxdb.conf + + influxd + ``` + +InfluxDB first checks for the `-config` option and then for the environment +variable. + +### Configuration Options Overview + +Every configuration section has configuration options and every configuration option is optional. +If you do not uncomment a configuration option, the system uses its default setting. +The configuration options in this document are set to their default settings. 
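+ +For example, to override a single default you only need to uncomment that one setting in `/etc/influxdb/influxdb.conf`; everything left commented out keeps its internal default. A minimal sketch of such an override (the section and setting names appear elsewhere on this page; the chosen value is illustrative only): + +```toml +[data] + # index-version = "inmem" # still commented out, so the internal default applies + query-log-enabled = false # uncommented, so it overrides the default of true +```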
+ +Configuration options that specify a duration support the following duration units: + +* `ns` nanoseconds +* `us` or `µs` microseconds +* `ms` milliseconds +* `s` seconds +* `m` minutes +* `h` hours +* `d` days +* `w` weeks + +>**Note:** This page documents configuration options for the latest official release - the [sample configuration file on GitHub](https://github.com/influxdb/influxdb/blob/1.3/etc/config.sample.toml) will always be slightly ahead of what is documented here. + +### Environment Variables + +All configuration options can be specified in the configuration file or in an +environment variable. +The environment variable overrides the equivalent option in the configuration +file. +If a configuration option is not specified in either the configuration file +or in an environment variable, InfluxDB uses its internal default +configuration. + +In the sections below we name the relevant environment variable in the +description for the configuration setting. + +> **Note:** +To set or override settings in a config section that allows multiple +configurations (any section with [[`double_brackets`]] in the header supports +multiple configurations), the desired configuration must be specified by ordinal +number.
+For example, for the first set of `[[graphite]]` environment variables, +prefix the configuration setting name in the environment variable with the +relevant position number (in this case: `0`): +> + INFLUXDB_GRAPHITE_0_BATCH_PENDING + INFLUXDB_GRAPHITE_0_BATCH_SIZE + INFLUXDB_GRAPHITE_0_BATCH_TIMEOUT + INFLUXDB_GRAPHITE_0_BIND_ADDRESS + INFLUXDB_GRAPHITE_0_CONSISTENCY_LEVEL + INFLUXDB_GRAPHITE_0_DATABASE + INFLUXDB_GRAPHITE_0_ENABLED + INFLUXDB_GRAPHITE_0_PROTOCOL + INFLUXDB_GRAPHITE_0_RETENTION_POLICY + INFLUXDB_GRAPHITE_0_SEPARATOR + INFLUXDB_GRAPHITE_0_TAGS + INFLUXDB_GRAPHITE_0_TEMPLATES + INFLUXDB_GRAPHITE_0_UDP_READ_BUFFER +> +For the Nth Graphite configuration in the configuration file, the relevant +environment variables would be of the form `INFLUXDB_GRAPHITE_(N-1)_BATCH_PENDING`. +For each section of the configuration file the numbering restarts at zero. + +## Global Options + +### reporting-disabled = false + +InfluxData, the company, relies on reported data from running nodes +primarily to track the adoption rates of different InfluxDB versions. +This data helps InfluxData support the continuing development of +InfluxDB. + +The `reporting-disabled` option toggles +the reporting of data every 24 hours to `usage.influxdata.com`. +Each report includes a randomly-generated identifier, OS, architecture, +InfluxDB version, and the +number of [databases](/influxdb/v1.3/concepts/glossary/#database), +[measurements](/influxdb/v1.3/concepts/glossary/#measurement), and +unique [series](/influxdb/v1.3/concepts/glossary/#series). Setting +this option to `true` will disable reporting. + +>**Note:** No data from user databases is ever transmitted. + +Environment variable: `INFLUXDB_REPORTING_DISABLED` + +### bind-address = "127.0.0.1:8088" + +The bind address to use for the RPC service for [backup and restore](/influxdb/v1.3/administration/backup_and_restore/). + +Environment variable: `INFLUXDB_BIND_ADDRESS` + +### GOMAXPROCS + +GOMAXPROCS is a GoLang setting. 
+ +The default value of GOMAXPROCS is the number of CPUs (whatever your operating system considers to be a CPU; for example, GOMAXPROCS=32 for a 32-core machine) visible to the program *at startup.* +However, you can override this value to be less than the maximum value. +This can be important in cases where you are running the database alongside other processes on the same machine and +want to ensure that the database doesn't completely starve those processes. + +Keep in mind that setting GOMAXPROCS=1 will eliminate all parallelization. + +Environment variable: `GOMAXPROCS` + +## [meta] + +This section controls parameters for InfluxDB's metastore, +which stores information on users, databases, retention policies, shards, and +continuous queries. + +### dir = "/var/lib/influxdb/meta" + +The `meta` directory. +Files in the `meta` directory include `meta.db`. + +>**Note:** The default directory for macOS installations is `/Users/<username>/.influxdb/meta` + +Environment variable: `INFLUXDB_META_DIR` + +### retention-autocreate = true + +Retention policy auto-creation automatically creates the [`DEFAULT` retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) `autogen` when a database is created. +The retention policy `autogen` has an infinite duration and is also set as the +database's `DEFAULT` retention policy, which is used when a write or query does +not specify a retention policy. +Disable this setting to prevent the creation of this retention policy when creating databases. + +Environment variable: `INFLUXDB_META_RETENTION_AUTOCREATE` + +### logging-enabled = true + +Meta logging toggles the logging of messages from the meta service. + +Environment variable: `INFLUXDB_META_LOGGING_ENABLED` + +## [data] + +This section controls where the actual shard data for InfluxDB lives and how it is flushed from the WAL.
`dir` may need to be changed to a suitable place for your system, but the WAL settings are an advanced configuration. The defaults should work for most systems. + +### dir = "/var/lib/influxdb/data" + +The directory where InfluxDB stores the data. +This directory may be changed. + +>**Note:** The default directory for macOS installations is `/Users/<username>/.influxdb/data` + +Environment variable: `INFLUXDB_DATA_DIR` + +### index-version = "inmem" + +The type of shard index to use for new shards. +The default is an in-memory (TSM-based) index that is recreated at startup. +A value of `tsi1` will use a disk-based (TSI) index that supports higher cardinality datasets. +Existing in-memory (TSM-based) shards will continue to be used unless converted using the [`influx_inspect inmem2tsi`](/influxdb/v1.3/tools/influx_inspect/#influx-inspect-inmem2tsi) command. + +Environment variable: `INFLUXDB_DATA_INDEX_VERSION` + +### wal-dir = "/var/lib/influxdb/wal" + +The WAL directory is the location of the [write ahead log](/influxdb/v1.3/concepts/glossary/#wal-write-ahead-log). + +Environment variable: `INFLUXDB_DATA_WAL_DIR` + + +### wal-fsync-delay = "0s" + +The amount of time that a write waits before fsyncing. Use a duration greater than `0` to batch up multiple fsync calls. +This is useful for slower disks or when experiencing [WAL](/influxdb/v1.3/concepts/glossary/#wal-write-ahead-log) write contention. +A value of `0s` fsyncs every write to the WAL. +We recommend values in the range of `0ms`-`100ms` for non-SSD disks. + +Environment variable: `INFLUXDB_DATA_WAL_FSYNC_DELAY` + +### trace-logging-enabled = false + +Toggles logging of additional debug information within the TSM engine and WAL. + +Environment variable: `INFLUXDB_DATA_TRACE_LOGGING_ENABLED` + +### query-log-enabled = true + +The query log enabled setting toggles the logging of parsed queries before execution. +Very useful for troubleshooting, but will log any sensitive data contained within a query.
+ +Environment variable: `INFLUXDB_DATA_QUERY_LOG_ENABLED` + +### cache-max-memory-size = 1073741824 + +The cache maximum memory size is the maximum size (in bytes) a shard's cache can reach before it starts rejecting writes. + +Environment variable: `INFLUXDB_DATA_CACHE_MAX_MEMORY_SIZE` + +### cache-snapshot-memory-size = 26214400 + +The cache snapshot memory size is the size at which the engine will snapshot the cache and write it to a TSM file, freeing up memory. + +Environment variable: `INFLUXDB_DATA_CACHE_SNAPSHOT_MEMORY_SIZE` + +### cache-snapshot-write-cold-duration = "10m" + +The cache snapshot write cold duration is the length of time at which the engine will snapshot the cache and write it to a new TSM file if the shard hasn't received writes or deletes. + +Environment variable: `INFLUXDB_DATA_CACHE_SNAPSHOT_WRITE_COLD_DURATION` + +### compact-full-write-cold-duration = "4h" + +The compact full write cold duration is the duration at which the engine will compact all TSM files in a shard if it hasn't received a write or delete. + +Environment variable: `INFLUXDB_DATA_COMPACT_FULL_WRITE_COLD_DURATION` + +### max-concurrent-compactions = 0 + +The maximum number of concurrent full and level [compactions](/influxdb/v1.3/concepts/storage_engine/#compactions) that can run at one time. +A value of `0` results in `runtime.GOMAXPROCS(0)` being used at runtime, which means all available processors are used. +This setting does not apply to cache snapshotting. + +Environment variable: `INFLUXDB_DATA_MAX_CONCURRENT_COMPACTIONS` + +### max-series-per-database = 1000000 + +The maximum number of [series](/influxdb/v1.3/concepts/glossary/#series) allowed +per database. +The default setting is one million. +Change the setting to `0` to allow an unlimited number of series per database.
+ +If a point causes the number of series in a database to exceed +`max-series-per-database`, InfluxDB will not write the point, and it returns a +`500` with the following error: + +``` +{"error":"max series per database exceeded: <series>"} +``` + +> **Note:** Any existing databases with a series count that exceeds `max-series-per-database` +will continue to accept writes to existing series, but writes that create a +new series will fail. + +> **Note:** This setting is ignored when [index-version](#index-version-inmem) is set to `tsi1`. + +Environment variable: `INFLUXDB_DATA_MAX_SERIES_PER_DATABASE` + +### max-values-per-tag = 100000 + +The maximum number of [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) +allowed per [tag key](/influxdb/v1.3/concepts/glossary/#tag-key). +The default setting is `100000`. +Change the setting to `0` to allow an unlimited number of tag values per tag +key. +If a tag value causes the number of tag values of a tag key to exceed +`max-values-per-tag`, InfluxDB will not write the point, and it returns +a `partial write` error. + +Any existing tag keys with tag values that exceed `max-values-per-tag` +will continue to accept writes, but writes that create a new tag value +will fail. + +> **Note:** This setting is ignored when [index-version](#index-version-inmem) is set to `tsi1`. + +Environment variable: `INFLUXDB_DATA_MAX_VALUES_PER_TAG` + +## [coordinator] + +This section contains configuration options for query management. +For more on managing queries, see [Query Management](/influxdb/v1.3/troubleshooting/query_management/). + +### write-timeout = "10s" + +The time within which a write request must complete on the cluster. + +Environment variable: `INFLUXDB_COORDINATOR_WRITE_TIMEOUT` + +### max-concurrent-queries = 0 + +The maximum number of running queries allowed on your instance. +The default setting (`0`) allows for an unlimited number of queries.
+ +Environment variable: `INFLUXDB_COORDINATOR_MAX_CONCURRENT_QUERIES` + +### query-timeout = "0s" + +The maximum time for which a query can run on your instance before InfluxDB +kills the query. +The default setting (`0`) allows queries to run with no time restrictions. +This setting is a [duration](#configuration-options-overview). + +Environment variable: `INFLUXDB_COORDINATOR_QUERY_TIMEOUT` + +### log-queries-after = "0s" + +The time after which a running query is logged with a +`Detected slow query` message. +The default setting (`"0s"`) disables slow query logging. +This setting is a +[duration](#configuration-options-overview). + +Environment variable: `INFLUXDB_COORDINATOR_LOG_QUERIES_AFTER` + +### max-select-point = 0 + +The maximum number of [points](/influxdb/v1.3/concepts/glossary/#point) that a +`SELECT` statement can process. +The default setting (`0`) allows the `SELECT` statement to process an unlimited +number of points. + +Environment variable: `INFLUXDB_COORDINATOR_MAX_SELECT_POINT` + +### max-select-series = 0 + +The maximum number of [series](/influxdb/v1.3/concepts/glossary/#series) that a +`SELECT` statement can process. +The default setting (`0`) allows the `SELECT` statement to process an unlimited +number of series. + +Environment variable: `INFLUXDB_COORDINATOR_MAX_SELECT_SERIES` + +### max-select-buckets = 0 + +The maximum number of `GROUP BY time()` buckets that a query can process. +The default setting (`0`) allows a query to process an unlimited number of +buckets. + +Environment variable: `INFLUXDB_COORDINATOR_MAX_SELECT_BUCKETS` + +## [retention] + +This section controls the enforcement of retention policies for evicting old data. + +### enabled = true + +Set to `false` to prevent InfluxDB from enforcing retention policies. + +Environment variable: `INFLUXDB_RETENTION_ENABLED` + +### check-interval = "30m0s" + +The rate at which InfluxDB checks to enforce a retention policy.
+ +Environment variable: `INFLUXDB_RETENTION_CHECK_INTERVAL` + +## [shard-precreation] + +Controls the precreation of shards so that shards are available before data arrives. +Only shards that, after creation, will have both a start and an end time in the future are ever created. +Shards that would be wholly or partially in the past are never precreated. + +### enabled = true + +Environment variable: `INFLUXDB_SHARD_PRECREATION_ENABLED` + +### check-interval = "10m" + +Environment variable: `INFLUXDB_SHARD_PRECREATION_CHECK_INTERVAL` + +### advance-period = "30m" + +The maximum period in the future for which InfluxDB precreates shards. +The `30m` default should work for most systems. +Increasing this setting too far into the future can cause inefficiencies. + +Environment variable: `INFLUXDB_SHARD_PRECREATION_ADVANCE_PERIOD` + +## [monitor] + +This section controls InfluxDB's [system self-monitoring](https://github.com/influxdb/influxdb/blob/1.3/monitor/README.md). + +By default, InfluxDB writes the data to the `_internal` database. +If that database does not exist, InfluxDB creates it automatically. +The `DEFAULT` retention policy on the `_internal` database is seven days. +If you want to use a retention policy other than the seven-day retention policy, you must [create](/influxdb/v1.3/query_language/database_management/#retention-policy-management) it. + +### store-enabled = true + +Set to `false` to disable recording statistics internally. +Doing so makes it substantially more difficult to diagnose issues with your installation. + +Environment variable: `INFLUXDB_MONITOR_STORE_ENABLED` + +### store-database = "\_internal" + +The destination database for recorded statistics. + +Environment variable: `INFLUXDB_MONITOR_STORE_DATABASE` + +### store-interval = "10s" + +The interval at which InfluxDB records statistics.
+ +Environment variable: `INFLUXDB_MONITOR_STORE_INTERVAL` + +## [admin] + +{{% warn %}} In version 1.3, the web admin interface is no longer available in InfluxDB. +The interface does not run on port `8083` and InfluxDB ignores the `[admin]` section in the configuration file if that section is present. +[Chronograf](/chronograf/v1.3/) replaces the web admin interface with improved tooling for querying data, writing data, and database management. +See [Chronograf's transition guide](/chronograf/v1.3/guides/transition-web-admin-interface/) for more information. +{{% /warn %}} + +## [http] + +This section controls how InfluxDB configures the HTTP endpoints. +These are the primary mechanisms for getting data into and out of InfluxDB. +Edit the options in this section to enable HTTPS and authentication. +See [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/). + +### enabled = true + +Set to `false` to disable HTTP. +Note that the InfluxDB [command line interface (CLI)](/influxdb/v1.3/tools/shell/) connects to the database using the HTTP API. + +Environment variable: `INFLUXDB_HTTP_ENABLED` + +### bind-address = ":8086" + +The port used by the HTTP API. + +Environment variable: `INFLUXDB_HTTP_BIND_ADDRESS` + +### auth-enabled = false + +Set to `true` to require authentication. + +Environment variable: `INFLUXDB_HTTP_AUTH_ENABLED` + +### realm = "InfluxDB" + +Realm is the JWT realm used by the http endpoint. + +Environment variable: `INFLUXDB_HTTP_REALM` + +### log-enabled = true + +Set to `false` to disable logging. + +Environment variable: `INFLUXDB_HTTP_LOG_ENABLED` + +### write-tracing = false + +Set to `true` to enable logging for the write payload. +If set to `true`, this will duplicate every write statement in the logs and is thus not recommended for general use. + +Environment variable: `INFLUXDB_HTTP_WRITE_TRACING` + +### pprof-enabled = true + +Determines whether the pprof endpoint is enabled. 
This endpoint is used for +troubleshooting and monitoring. + +Environment variable: `INFLUXDB_HTTP_PPROF_ENABLED` + +### https-enabled = false + +Set to `true` to enable HTTPS. + +Environment variable: `INFLUXDB_HTTP_HTTPS_ENABLED` + +### https-certificate = "/etc/ssl/influxdb.pem" + +The path of the certificate file. + +Environment variable: `INFLUXDB_HTTP_HTTPS_CERTIFICATE` + +### https-private-key = "" + +The separate private key location. +If only the `https-certificate` is specified, the httpd service will try to load +the private key from the `https-certificate` file. +If a separate `https-private-key` file is specified, the httpd service will load +the private key from the `https-private-key` file. + +Environment variable: `INFLUXDB_HTTP_HTTPS_PRIVATE_KEY` + +### shared-secret = "" + +The shared secret used for JWT signing. + +Environment variable: `INFLUXDB_HTTP_SHARED_SECRET` + +### max-row-limit = 0 + +Limits the number of rows that the system can return in a [non-chunked](/influxdb/v1.3/tools/api/#query-string-parameters) query. +The default setting (`0`) allows for an unlimited number of rows. +InfluxDB includes a `"partial":true` tag in the response body if query results exceed the `max-row-limit` setting. + +Environment variable: `INFLUXDB_HTTP_MAX_ROW_LIMIT` + +### max-connection-limit = 0 + +Limit the number of connections for the http service. 0 is unlimited. + +Environment variable: `INFLUXDB_HTTP_MAX_CONNECTION_LIMIT` + +### unix-socket-enabled = false + +Set to `true` to enable http service over unix domain socket. + +Environment variable: `INFLUXDB_HTTP_UNIX_SOCKET_ENABLED` + +### bind-socket = "/var/run/influxdb.sock" + +The path of the unix domain socket. + +Environment variable: `INFLUXDB_HTTP_UNIX_BIND_SOCKET` + +### max-body-size = 25000000 + +Specifies the maximum size (in bytes) of a client request body. When a client sends data that exceeds the configured +maximum size, a 413 Request Entity Too Large HTTP response is returned. 
This can be disabled by setting it to 0. + +Environment variable: `INFLUXDB_HTTP_MAX_BODY_SIZE` + +## [subscriber] + +This section controls how [Kapacitor](/kapacitor/v1.3/) will receive data. + +### enabled = true + +Set to `false` to disable the subscriber service. + +Environment variable: `INFLUXDB_SUBSCRIBER_ENABLED` + +### http-timeout = "30s" + +Controls how long an http request for the subscriber service will run before it times out. + +Environment variable: `INFLUXDB_SUBSCRIBER_HTTP_TIMEOUT` + +### insecure-skip-verify = false + +Allows insecure HTTPS connections to subscribers. +This is useful when testing with self-signed certificates. + +Environment variable: `INFLUXDB_SUBSCRIBER_INSECURE_SKIP_VERIFY` + +### ca-certs = "" + +The path to the PEM encoded CA certs file. +If the empty string, the default system certs will be used. + +Environment variable: `INFLUXDB_SUBSCRIBER_CA_CERTS` + +### write-concurrency = 40 + +The number of writer goroutines processing the write channel. + +Environment variable: `INFLUXDB_SUBSCRIBER_WRITE_CONCURRENCY` + +### write-buffer-size = 1000 + +The number of in-flight writes buffered in the write channel. + +Environment variable: `INFLUXDB_SUBSCRIBER_WRITE_BUFFER_SIZE` + +## [[graphite]] + +This section controls one or many listeners for Graphite data. +See the [README](https://github.com/influxdb/influxdb/blob/1.3/services/graphite/README.md) on GitHub for more information. + +### enabled = false + +Set to `true` to enable Graphite input. + +Environment variable: `INFLUXDB_GRAPHITE_0_ENABLED` + +### database = "graphite" + +The name of the database that you want to write to. + +Environment variable: `INFLUXDB_GRAPHITE_0_DATABASE` + +### retention-policy = "" + +The relevant retention policy. +An empty string is equivalent to the database's `DEFAULT` retention policy. + +Environment variable: `INFLUXDB_GRAPHITE_0_RETENTION_POLICY` + +### bind-address = ":2003" + +The default port. 
+
+Environment variable: `INFLUXDB_GRAPHITE_0_BIND_ADDRESS`
+
+### protocol = "tcp"
+
+Set to `tcp` or `udp`.
+
+Environment variable: `INFLUXDB_GRAPHITE_PROTOCOL`
+
+### consistency-level = "one"
+
+The number of nodes that must confirm the write.
+If the requirement is not met, the return value will be either `partial write` if some points in the batch fail or `write failure` if all points in the batch fail.
+For more information, see the Query String Parameters for Writes section in the [Line Protocol Syntax Reference](/influxdb/v1.3/write_protocols/write_syntax/).
+
+Environment variable: `INFLUXDB_GRAPHITE_CONSISTENCY_LEVEL`
+
+*The next three options control how batching works.
+You should have batching enabled; otherwise, you could get dropped metrics or poor performance.
+Batching will buffer points in memory if you have many coming in.*
+
+### batch-size = 5000
+
+The input will flush if this many points get buffered.
+
+Environment variable: `INFLUXDB_GRAPHITE_BATCH_SIZE`
+
+### batch-pending = 10
+
+The number of batches that may be pending in memory.
+
+Environment variable: `INFLUXDB_GRAPHITE_BATCH_PENDING`
+
+### batch-timeout = "1s"
+
+The input will flush at least this often, even if it hasn't reached the configured batch size.
+
+Environment variable: `INFLUXDB_GRAPHITE_BATCH_TIMEOUT`
+
+### udp-read-buffer = 0
+
+The UDP read buffer size (`0` means the OS default).
+The UDP listener will fail if this value is set above the OS maximum.
+
+Environment variable: `INFLUXDB_GRAPHITE_UDP_READ_BUFFER`
+
+### separator = "."
+
+This string joins multiple matching 'measurement' values, providing more control over the final measurement name.
+
+Environment variable: `INFLUXDB_GRAPHITE_SEPARATOR`
+
+## [[collectd]]
+
+This section controls the listener for collectd data. See the
+[README](https://github.com/influxdata/influxdb/tree/master/services/collectd)
+on GitHub for more information.
+
+### enabled = false
+
+Set to `true` to enable collectd writes.
+
+Environment variable: `INFLUXDB_COLLECTD_ENABLED`
+
+### bind-address = ":25826"
+
+The default port.
+
+Environment variable: `INFLUXDB_COLLECTD_BIND_ADDRESS`
+
+### database = "collectd"
+
+The name of the database that you want to write to.
+This defaults to `collectd`.
+
+Environment variable: `INFLUXDB_COLLECTD_DATABASE`
+
+### retention-policy = ""
+
+The relevant retention policy.
+An empty string is equivalent to the database's `DEFAULT` retention policy.
+
+Environment variable: `INFLUXDB_COLLECTD_RETENTION_POLICY`
+
+### typesdb = "/usr/local/share/collectd"
+
+The collectd service supports either scanning a directory for multiple `types.db` files or specifying a single `types.db` file.
+A sample `types.db` file
+can be found
+[here](https://github.com/collectd/collectd/blob/master/src/types.db).
+
+Environment variable: `INFLUXDB_COLLECTD_TYPESDB`
+
+### security-level = "none"
+
+Environment variable: `INFLUXDB_COLLECTD_SECURITY_LEVEL`
+
+### auth-file = "/etc/collectd/auth_file"
+
+Environment variable: `INFLUXDB_COLLECTD_AUTH_FILE`
+
+*The next three options control how batching works.
+You should have batching enabled; otherwise, you could get dropped metrics or poor performance.
+Batching will buffer points in memory if you have many coming in.*
+
+### batch-size = 5000
+
+The input will flush if this many points get buffered.
+
+Environment variable: `INFLUXDB_COLLECTD_BATCH_SIZE`
+
+### batch-pending = 10
+
+The number of batches that may be pending in memory.
+
+Environment variable: `INFLUXDB_COLLECTD_BATCH_PENDING`
+
+### batch-timeout = "10s"
+
+The input will flush at least this often, even if it hasn't reached the configured batch size.
+
+Environment variable: `INFLUXDB_COLLECTD_BATCH_TIMEOUT`
+
+### read-buffer = 0
+
+The UDP read buffer size (`0` means the OS default).
+The UDP listener will fail if this value is set above the OS maximum.
+
+Environment variable: `INFLUXDB_COLLECTD_READ_BUFFER`
+
+
+## [[opentsdb]]
+
+Controls the listener for OpenTSDB data.
+See the [README](https://github.com/influxdb/influxdb/blob/1.3/services/opentsdb/README.md) on GitHub for more information.
+
+### enabled = false
+
+Set to `true` to enable OpenTSDB writes.
+
+Environment variable: `INFLUXDB_OPENTSDB_0_ENABLED`
+
+### bind-address = ":4242"
+
+The default port.
+
+Environment variable: `INFLUXDB_OPENTSDB_BIND_ADDRESS`
+
+### database = "opentsdb"
+
+The name of the database that you want to write to.
+If the database does not exist, it will be created automatically when the input is initialized.
+
+Environment variable: `INFLUXDB_OPENTSDB_DATABASE`
+
+### retention-policy = ""
+
+The relevant retention policy.
+An empty string is equivalent to the database's `DEFAULT` retention policy.
+
+Environment variable: `INFLUXDB_OPENTSDB_RETENTION_POLICY`
+
+### consistency-level = "one"
+
+Sets the write consistency level for writes: `any`, `one`, `quorum`, or `all`.
+
+Environment variable: `INFLUXDB_OPENTSDB_CONSISTENCY_LEVEL`
+
+### tls-enabled = false
+
+Environment variable: `INFLUXDB_OPENTSDB_TLS_ENABLED`
+
+### certificate = "/etc/ssl/influxdb.pem"
+
+Environment variable: `INFLUXDB_OPENTSDB_CERTIFICATE`
+
+### log-point-errors = true
+
+Log an error for every malformed point.
+
+Environment variable: `INFLUXDB_OPENTSDB_0_LOG_POINT_ERRORS`
+
+*The next three options control how batching works.
+You should have batching enabled; otherwise, you could get dropped metrics or poor performance.
+Only points received over the telnet protocol undergo batching.*
+
+### batch-size = 1000
+
+The input will flush if this many points get buffered.
+
+Environment variable: `INFLUXDB_OPENTSDB_BATCH_SIZE`
+
+### batch-pending = 5
+
+The number of batches that may be pending in memory.
+
+Environment variable: `INFLUXDB_OPENTSDB_BATCH_PENDING`
+
+### batch-timeout = "1s"
+
+The input will flush at least this often, even if it hasn't reached the configured batch size.
+
+Environment variable: `INFLUXDB_OPENTSDB_BATCH_TIMEOUT`
+
+## [[udp]]
+
+This section controls the listeners for InfluxDB line protocol data via UDP.
+See the [UDP page](/influxdb/v1.3/write_protocols/udp/) for more information.
+
+### enabled = false
+
+Set to `true` to enable writes over UDP.
+
+Environment variable: `INFLUXDB_UDP_ENABLED`
+
+### bind-address = ":8089"
+
+An empty string is equivalent to `0.0.0.0`.
+
+Environment variable: `INFLUXDB_UDP_BIND_ADDRESS`
+
+### database = "udp"
+
+The name of the database that you want to write to.
+
+Environment variable: `INFLUXDB_UDP_DATABASE`
+
+### retention-policy = ""
+
+The relevant retention policy for your data.
+An empty string is equivalent to the database's `DEFAULT` retention policy.
+
+Environment variable: `INFLUXDB_UDP_RETENTION_POLICY`
+
+*The next three options control how batching works.
+You should have batching enabled; otherwise, you could get dropped metrics or poor performance.
+Batching will buffer points in memory if you have many coming in.*
+
+### batch-size = 5000
+
+The input will flush if this many points get buffered.
+
+Environment variable: `INFLUXDB_UDP_0_BATCH_SIZE`
+
+### batch-pending = 10
+
+The number of batches that may be pending in memory.
+
+Environment variable: `INFLUXDB_UDP_0_BATCH_PENDING`
+
+### batch-timeout = "1s"
+
+The input will flush at least this often, even if it hasn't reached the configured batch size.
+
+Environment variable: `INFLUXDB_UDP_BATCH_TIMEOUT`
+
+### read-buffer = 0
+
+The UDP read buffer size (`0` means the OS default).
+The UDP listener will fail if this value is set above the OS maximum.
+
+Environment variable: `INFLUXDB_UDP_READ_BUFFER`
+
+### precision = ""
+
+[Time precision](/influxdb/v1.3/query_language/spec/#durations) used when decoding time values. Defaults to nanoseconds, which is the database's default precision.
+ +Environment variable: `INFLUXDB_UDP_PRECISION` + +## [continuous_queries] + +This section controls how [continuous queries (CQs)](/influxdb/v1.3/concepts/glossary/#continuous-query-cq) run within InfluxDB. +CQs are automated batches of queries that execute over recent time intervals. +InfluxDB executes one auto-generated query per `GROUP BY time()` interval. + +### enabled = true + +Set to `false` to disable CQs. + +Environment variable: `INFLUXDB_CONTINUOUS_QUERIES_ENABLED` + +### log-enabled = true + +Set to `false` to disable logging for CQ events. + +Environment variable: `INFLUXDB_CONTINUOUS_QUERIES_LOG_ENABLED` + +### query-stats-enabled = false + +When set to true, continuous query execution statistics are written to the default monitor store. + +Environment variable: `INFLUXDB_CONTINUOUS_QUERIES_QUERY_STATS_ENABLED` + +### run-interval = "1s" + +The interval at which InfluxDB checks to see if a CQ needs to run. Set this option to the lowest interval at which your CQs run. For example, if your most frequent CQ runs every minute, set `run-interval` to `1m`. + +Environment variable: `INFLUXDB_CONTINUOUS_QUERIES_RUN_INTERVAL` diff --git a/content/influxdb/v1.3/administration/differences.md b/content/influxdb/v1.3/administration/differences.md new file mode 100644 index 000000000..9ca17cd7c --- /dev/null +++ b/content/influxdb/v1.3/administration/differences.md @@ -0,0 +1,214 @@ +--- +title: Differences Between InfluxDB 1.3 and 1.2 +aliases: + - influxdb/v1.3/concepts/013_vs_1/ + - influxdb/v1.3/concepts/012_vs_013/ + - influxdb/v1.3/concepts/011_vs_012/ + - influxdb/v1.3/concepts/010_vs_011/ + - influxdb/v1.3/concepts/09_vs_010/ + - influxdb/v1.3/concepts/08_vs_09/ +menu: + influxdb_1_3: + weight: 40 + parent: Administration +--- + +This page aims to ease the transition from InfluxDB 1.2 to InfluxDB 1.3. +For a comprehensive list of the differences between the versions +see [InfluxDB's Changelog](/influxdb/v1.3/about_the_project/releasenotes-changelog/). 
+ +### Content +* [TSI Release](#tsi-release) +* [Web Admin UI Removal](#web-admin-ui-removal) +* [Duration Unit Updates](#duration-unit-updates) +* [InfluxQL Updates](#influxql-updates) + * [Operators](#operators) + * [Functions](#functions) + * [Other](#other) + +## TSI Release +Version 1.3.0 marked the first official release of InfluxDB's new time series index (TSI) engine. + +The TSI engine is a significant technical advancement in InfluxDB. +It offers a solution to the [time-structured merge tree](https://docs.influxdata.com/influxdb/v1.3/concepts/storage_engine/) engine's [high series cardinality issue](https://docs.influxdata.com/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter). +With TSI, the number of series should be unbounded by the memory on the server hardware and the number of existing series will have a negligible impact on database startup time. +See Paul Dix's blogpost [Path to 1 Billion Time Series: InfluxDB High Cardinality Indexing Ready for Testing](https://www.influxdata.com/path-1-billion-time-series-influxdb-high-cardinality-indexing-ready-testing/) for additional information. + +TSI is disabled by default in version 1.3. +To enable TSI, uncomment the [`index-version` setting](/influxdb/v1.3/administration/config/#index-version-inmem) and set it to `tsi1`. +The `index-version` setting is in the `[data]` section of the configuration file. +Next, restart your InfluxDB instance. + +``` +[data] + dir = "/var/lib/influxdb/data" + index-version = "tsi1" +``` + +## Web Admin UI Removal + +In version 1.3, the web admin interface is no longer available in InfluxDB. +The interface does not run on port `8083` and InfluxDB ignores the `[admin]` section in the configuration file if that section is present. +[Chronograf](/chronograf/v1.3/) replaces the web admin interface with improved tooling for querying data, writing data, and database management. 
+See [Chronograf's transition guide](/chronograf/v1.3/guides/transition-web-admin-interface/) for more information. + +## Duration Unit Updates + +Duration units specify the time precision in InfluxQL queries and when writing data to InfluxDB. +Version 1.3 introduces two updates to duration units. + +InfluxDB now supports the nanosecond (`ns`) duration literal. +The query below uses a [`GROUP BY time()` clause](/influxdb/v1.3/query_language/data_exploration/#group-by-time-intervals) to group [averages](/influxdb/v1.3/query_language/functions/#mean) into `1000000000` nanosecond buckets: +``` +> SELECT MEAN("value") FROM "gopher" WHERE time >= 1497481480598711679 AND time <= 1497481484005926368 GROUP BY time(1000000000ns) +``` + +Version 1.3 also changes the way InfluxDB handles queries with an invalid duration unit. +In versions prior to 1.3, the system ignored invalid duration units and did not return an error. +In version 1.3, the system returns an error if the query includes an invalid duration unit. +The following query erroneously specifies oranges as a duration unit: + +``` +> SELECT MEAN("value") FROM "gopher" WHERE time >= 1497481480598711679 AND time <= 1497481484005926368 GROUP BY time(2oranges) +ERR: error parsing query: invalid duration +``` + +## InfluxQL Updates + +### Operators + +Version 1.3 introduces several new mathematical operators. +Follow the links below to learn more: + +* [Modulo (`%`)](/influxdb/v1.3/query_language/math_operators/#modulo) +* [Bitwise AND (`&`)](/influxdb/v1.3/query_language/math_operators/#bitwise-and) +* [Bitwise OR (`|`)](/influxdb/v1.3/query_language/math_operators/#bitwise-or) +* [Bitwise Exclusive-OR (`^`)](/influxdb/v1.3/query_language/math_operators/#bitwise-exclusive-or) + +### Functions + +InfluxDB version 1.3 introduces two new functions and updates the behavior for the existing `TOP()` and `BOTTOM()` functions. 
+ +#### New function: `INTEGRAL()` + +The `INTEGRAL()` function returns the area under the curve for subsequent [field values](/influxdb/v1.3/concepts/glossary/#field-value). +The query below returns the area under the curve (in seconds) for the field values associated with the `water_level` field key and in the `h2o_feet` measurement: + +``` +> SELECT INTEGRAL("water_level") FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' + +name: h2o_feet +time integral +---- -------- +1970-01-01T00:00:00Z 3732.66 +``` + +See the [functions page](/influxdb/v1.3/query_language/functions/#integral) for detailed documentation. + +#### New function: `NON_NEGATIVE_DIFFERENCE()` + +The `NON_NEGATIVE_DIFFERENCE()` function returns the non-negative result of subtraction between subsequent [field values](/influxdb/v1.3/concepts/glossary/#field-value). +Non-negative results of subtraction include positive differences and differences that equal zero. +The query below returns the non-negative difference between subsequent field values in the `water_level` field key and in the `h2o_feet` measurement: + +``` +> SELECT NON_NEGATIVE_DIFFERENCE("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' AND "location" = 'santa_monica' + +name: h2o_feet +time non_negative_difference +---- ----------------------- +2015-08-18T00:06:00Z 0.052000000000000046 +2015-08-18T00:18:00Z 0.09799999999999986 +2015-08-18T00:30:00Z 0.010000000000000231 +``` + +See the [functions page](/influxdb/v1.3/query_language/functions/#non-negative-difference) for detailed documentation. + +#### Updated functions: `TOP()` and `BOTTOM()` + +Version 1.3 introduces three major changes to the `TOP()` and `BOTTOM()` functions: + +> * The `TOP()` and `BOTTOM()` functions no longer support other functions in +the `SELECT` [clause](/influxdb/v1.3/query_language/data_exploration/#description-of-syntax). 
+The following query now returns an error: + + + > SELECT TOP(value,1),MEAN(value) FROM "gopher" + ERR: error parsing query: selector function top() cannot be combined with other functions + + +> * The `TOP()` and `BOTTOM()` functions now maintain `tags` as `tags` if the query includes a +[tag key](/influxdb/v1.3/concepts/glossary/#tag-key) as an argument. +The [query below](/influxdb/v1.3/query_language/functions/#issue-3-bottom-tags-and-the-into-clause) +preserves `location` as a tag in the newly-written data: + + + > SELECT BOTTOM("water_level","location",2) INTO "bottom_water_levels" FROM "h2o_feet" + name: result + time written + ---- ------- + 1970-01-01T00:00:00Z 2 + + > SHOW TAG KEYS FROM "bottom_water_levels" + name: bottom_water_levels + tagKey + ------ + location + + +> * The `TOP()` and `BOTTOM()` functions now preserve the timestamps in the original data when they're +used with the [`GROUP BY time()` clause](/influxdb/v1.3/query_language/data_exploration/#group-by-time-intervals). +The [following query](/influxdb/v1.3/query_language/functions/#issue-1-top-with-a-group-by-time-clause) returns +the points' original timestamps; the timestamps are not forced to match the start of the `GROUP BY time()` intervals: + + + > SELECT TOP("water_level",2) FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' AND "location" = 'santa_monica' GROUP BY time(18m) + + name: h2o_feet + time top + ---- ------ + __ + 2015-08-18T00:00:00Z 2.064 | + 2015-08-18T00:06:00Z 2.116 | <------- Greatest points for the first time interval + __ + __ + 2015-08-18T00:18:00Z 2.126 | + 2015-08-18T00:30:00Z 2.051 | <------- Greatest points for the second time interval + __ + + +Review the functions page for a complete discussion of the [`TOP()` function](/influxdb/v1.3/query_language/functions/#top) and the [`BOTTOM()` function](/influxdb/v1.3/query_language/functions/#bottom). 
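The updated per-interval behavior of `TOP()` can be sketched outside InfluxDB. The Python illustration below (not InfluxDB code; `top_per_interval` is a hypothetical helper) uses the sample `water_level` points from the queries above, with the 24-minute value inferred from the `NON_NEGATIVE_DIFFERENCE()` output, and selects the two greatest points per 18-minute interval while preserving their original timestamps:

```python
# Sketch of TOP("water_level", 2) with GROUP BY time(18m) semantics.
# (minute, water_level) pairs taken from the sample data above.
points = [(0, 2.064), (6, 2.116), (12, 2.028), (18, 2.126), (24, 2.041), (30, 2.051)]

def top_per_interval(points, n, interval):
    """Return the n greatest points per interval, keeping original timestamps."""
    buckets = {}
    for t, v in points:
        buckets.setdefault(t - t % interval, []).append((t, v))
    result = []
    for start in sorted(buckets):
        # Pick the n greatest values in the bucket, then report them in
        # time order with their original timestamps, not the bucket start.
        best = sorted(buckets[start], key=lambda p: p[1], reverse=True)[:n]
        result.extend(sorted(best))
    return result

print(top_per_interval(points, 2, 18))
# [(0, 2.064), (6, 2.116), (18, 2.126), (30, 2.051)]
```

The printed values match the four points returned by the `TOP()` query above: the greatest two points in each interval, at their original times.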
+
+### Other
+
+#### Time zone clause
+InfluxQL's new time zone clause applies the UTC offset of the specified time zone to query results.
+The query below returns data with timestamps in Chicago’s time zone:
+```
+ > SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:18:00Z' tz('America/Chicago')
+
+ name: h2o_feet
+ time water_level
+ ---- -----------
+ 2015-08-17T19:00:00-05:00 2.064
+ 2015-08-17T19:06:00-05:00 2.116
+ 2015-08-17T19:12:00-05:00 2.028
+ 2015-08-17T19:18:00-05:00 2.126
+```
+See the [data exploration page](/influxdb/v1.3/query_language/data_exploration/#the-time-zone-clause) for more information.
+
+#### Continuous Queries
+
+A defect was identified in the way continuous queries handled time ranges: for certain time intervals larger than 1d, continuous queries had their time ranges miscalculated and ran at the incorrect time.
+
+This has been addressed, but the change may impact existing continuous queries that process data in time
+ranges larger than 1d.
+Additional details [can be found here](https://github.com/influxdata/influxdb/issues/8569).
+
+
+#### CLI non-admin user updates
+In versions prior to v1.3, [non-admin users](/influxdb/v1.3/query_language/authentication_and_authorization/#user-types-and-privileges) could not execute a `USE ` query in the [CLI](/influxdb/v1.3/tools/shell/) even if they had `READ` and/or `WRITE` permissions on that database.
+Starting with version 1.3, non-admin users can execute the `USE ` query for databases on which they have `READ` and/or `WRITE` permissions.
+See the [FAQ page](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-can-a-non-admin-user-use-a-database-in-influxdb-s-cli) for more information.
diff --git a/content/influxdb/v1.3/administration/https_setup.md b/content/influxdb/v1.3/administration/https_setup.md
new file mode 100644
index 000000000..f44f836c4
--- /dev/null
+++ b/content/influxdb/v1.3/administration/https_setup.md
@@ -0,0 +1,216 @@
+---
+title: HTTPS Setup
+menu:
+  influxdb_1_3:
+    weight: 100
+    parent: Administration
+---
+
+This guide describes how to enable HTTPS with InfluxDB.
+Setting up HTTPS secures the communication between clients and the InfluxDB
+server,
+and, in some cases, HTTPS verifies the authenticity of the InfluxDB server to
+clients.
+
+If you plan on sending requests to InfluxDB over a network, we
+[strongly recommend](/influxdb/v1.3/administration/security/)
+that you set up HTTPS.
+
+## Requirements
+
+To set up HTTPS with InfluxDB, you'll need an existing or new InfluxDB instance
+and a Transport Layer Security (TLS) certificate (also known as a
+Secure Sockets Layer (SSL) certificate).
+InfluxDB supports three types of TLS/SSL certificates:
+
+* **Single domain certificates signed by a [Certificate Authority](https://en.wikipedia.org/wiki/Certificate_authority)**
+
+  These certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
+  With this certificate option, every InfluxDB instance requires a unique single domain certificate.
+
+* **Wildcard certificates signed by a Certificate Authority**
+
+  These certificates provide cryptographic security to HTTPS requests and allow clients to verify the identity of the InfluxDB server.
+  Wildcard certificates can be used across multiple InfluxDB instances on different servers.
+
+* **Self-signed certificates**
+
+  Self-signed certificates are not signed by a CA and you can [generate](#step-1-generate-a-self-signed-certificate) them on your own machine.
+  Unlike CA-signed certificates, self-signed certificates only provide cryptographic security to HTTPS requests.
+  They do not allow clients to verify the identity of the InfluxDB server.
+  We recommend using a self-signed certificate if you are unable to obtain a CA-signed certificate.
+  With this certificate option, every InfluxDB instance requires a unique self-signed certificate.
+
+Regardless of your certificate's type, InfluxDB supports certificates composed of
+a private key file (`.key`) and a signed certificate file (`.crt`), as well as certificates
+that combine the private key file and the signed certificate file into a single bundled file (`.pem`).
+
+The following two sections outline how to set up HTTPS with InfluxDB [using a CA-signed
+certificate](#setup-https-with-a-ca-signed-certificate) and [using a self-signed certificate](#setup-https-with-a-self-signed-certificate)
+on Ubuntu 16.04.
+Specific steps may be different for other operating systems.
+
+## Setup HTTPS with a CA-Signed Certificate
+
+#### Step 1: Install the SSL/TLS certificate
+
+Place the private key file (`.key`) and the signed certificate file (`.crt`)
+or the single bundled file (`.pem`) in the `/etc/ssl` directory.
+
+#### Step 2: Ensure file permissions
+Certificate files require read and write access by the `root` user.
+Ensure that you have the correct file permissions by running the following
+commands:
+
+```
+sudo chown root:root /etc/ssl/
+sudo chmod 644 /etc/ssl/
+sudo chmod 600 /etc/ssl/
+```
+
+#### Step 3: Enable HTTPS in InfluxDB's configuration file
+
+HTTPS is disabled by default.
+Enable HTTPS in the `[http]` section of InfluxDB's configuration file (`/etc/influxdb/influxdb.conf`) by setting:
+
+* `https-enabled` to `true`
+* `http-certificate` to `/etc/ssl/.crt` (or to `/etc/ssl/.pem`)
+* `http-private-key` to `/etc/ssl/.key` (or to `/etc/ssl/.pem`)
+
+```
+[http]
+
+  [...]
+
+  # Determines whether HTTPS is enabled.
+  https-enabled = true
+
+  [...]
+
+  # The SSL certificate to use when HTTPS is enabled.
+  https-certificate = ".pem"
+
+  # Use a separate private key location.
+  https-private-key = ".pem"
+```
+
+#### Step 4: Restart InfluxDB
+
+Restart the InfluxDB process for the configuration changes to take effect:
+```
+sudo systemctl restart influxdb
+```
+
+#### Step 5: Verify the HTTPS Setup
+
+Verify that HTTPS is working by connecting to InfluxDB with the [CLI tool](/influxdb/v1.3/tools/shell/):
+```
+influx -ssl -host .com
+```
+
+A successful connection returns the following:
+```
+Connected to https://.com:8086 version 1.x.x
+InfluxDB shell version: 1.x.x
+>
+```
+
+That's it! You've successfully set up HTTPS with InfluxDB.
+
+## Setup HTTPS with a Self-Signed Certificate
+
+#### Step 1: Generate a self-signed certificate
+
+The following command generates a private key file (`.key`) and a self-signed
+certificate file (`.crt`), which remain valid for the specified `NUMBER_OF_DAYS`.
+It outputs those files to InfluxDB's default certificate file paths and gives them
+the required permissions.
+
+```
+sudo openssl req -x509 -nodes -newkey rsa:2048 -keyout /etc/ssl/influxdb-selfsigned.key -out /etc/ssl/influxdb-selfsigned.crt -days 
+```
+
+When you execute the command, it will prompt you for more information.
+You can choose to fill out that information or leave it blank;
+both actions generate valid certificate files.
+
+#### Step 2: Enable HTTPS in InfluxDB's configuration file
+
+HTTPS is disabled by default.
+Enable HTTPS in the `[http]` section of InfluxDB's configuration file (`/etc/influxdb/influxdb.conf`) by setting:
+
+* `https-enabled` to `true`
+* `http-certificate` to `/etc/ssl/influxdb-selfsigned.crt`
+* `http-private-key` to `/etc/ssl/influxdb-selfsigned.key`
+
+```
+[http]
+
+  [...]
+
+  # Determines whether HTTPS is enabled.
+  https-enabled = true
+
+  [...]
+
+  # The SSL certificate to use when HTTPS is enabled.
+  https-certificate = "/etc/ssl/influxdb-selfsigned.crt"
+
+  # Use a separate private key location.
+ https-private-key = "/etc/ssl/influxdb-selfsigned.key" +``` + +#### Step 3: Restart InfluxDB + +Restart the InfluxDB process for the configuration changes to take effect: +``` +sudo systemctl restart influxdb +``` + +#### Step 4: Verify the HTTPS Setup + +Verify that HTTPS is working by connecting to InfluxDB with the [CLI tool](/influxdb/v1.3/tools/shell/): +``` +influx -ssl -unsafeSsl -host .com +``` + +A successful connection returns the following: +``` +Connected to https://.com:8086 version 1.x.x +InfluxDB shell version: 1.x.x +> +``` + +That's it! You've successfully set up HTTPS with InfluxDB. + +> +## Connect Telegraf to a secured InfluxDB instance +> +Connecting [Telegraf](/telegraf/v1.3/) to an InfluxDB instance that's using +HTTPS requires some additional steps. +> +In Telegraf's configuration file (`/etc/telegraf/telegraf.conf`), edit the `urls` +setting to indicate `https` instead of `http` and change `localhost` to the +relevant domain name. +If you're using a self-signed certificate, uncomment the `insecure_skip_verify` +setting and set it to `true`. +> + ############################################################################### + # OUTPUT PLUGINS # + ############################################################################### +> + # Configuration for influxdb server to send metrics to + [[outputs.influxdb]] + ## The full HTTP or UDP endpoint URL for your InfluxDB instance. + ## Multiple urls can be specified as part of the same cluster, + ## this means that only ONE of the urls will be written to each interval. + # urls = ["udp://localhost:8089"] # UDP endpoint example + urls = ["https://.com:8086"] +> + [...] +> + ## Optional SSL Config + [...] + insecure_skip_verify = true # <-- Update only if you're using a self-signed certificate +> +Next, restart Telegraf and you're all set! 
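Other clients follow the same pattern as Telegraf: verify CA-signed certificates by default, and explicitly opt out of verification only for self-signed certificates. The sketch below is a Python illustration of the two client-side TLS configurations using the standard `ssl` module; it is not part of any InfluxDB tooling:

```python
import ssl

# Default client context: verifies the server's certificate and host name.
# Use this when InfluxDB serves a CA-signed certificate.
verified_ctx = ssl.create_default_context()

# For a self-signed certificate (the `-unsafeSsl` / `insecure_skip_verify`
# cases above), disable host name checking and certificate verification.
unverified_ctx = ssl.create_default_context()
unverified_ctx.check_hostname = False
unverified_ctx.verify_mode = ssl.CERT_NONE
```

Either context can then be passed to an HTTPS request, for example `urllib.request.urlopen("https://your-influxdb-host:8086/ping", context=unverified_ctx)`, where `your-influxdb-host` is a placeholder for your domain.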
diff --git a/content/influxdb/v1.3/administration/logs.md b/content/influxdb/v1.3/administration/logs.md new file mode 100644 index 000000000..5faac140e --- /dev/null +++ b/content/influxdb/v1.3/administration/logs.md @@ -0,0 +1,65 @@ +--- +title: Logs + +menu: + influxdb_1_3: + weight: 10 + parent: Administration +--- + +InfluxDB writes log output, by default, to `stderr`. +Depending on your use case, this log information can be written to another location. + +## Running InfluxDB directly + +If you run InfluxDB directly, using `influxd`, all logs will be written to `stderr`. +You may redirect this log output as you would any output to `stderr` like so: + +```bash +influxd 2>$HOME/my_log_file +``` + +## Launched as a service + +### sysvinit + +If InfluxDB was installed using a pre-built package, and then launched +as a service, `stderr` is redirected to +`/var/log/influxdb/influxd.log`, and all log data will be written to +that file. You can override this location by setting the variable +`STDERR` in the file `/etc/default/influxdb`. + +>**Note:** On macOS the logs, by default, are stored in the file `/usr/local/var/log/influxdb.log` + +For example, if `/etc/default/influxdb` contains: + +```bash +STDERR=/dev/null +``` + +all log data will be discarded. You can similarly direct output to +`stdout` by setting `STDOUT` in the same file. Output to `stdout` is +sent to `/dev/null` by default when InfluxDB is launched as a service. + +InfluxDB must be restarted to pick up any changes to `/etc/default/influxdb`. + +### systemd + +Starting with version 1.0, InfluxDB on systemd systems will no longer +write files to `/var/log/influxdb` by default, and will now use the +system configured default for logging (usually journald). On most +systems, the logs will be directed to the systemd journal and can be +accessed with the command: + +``` +sudo journalctl -u influxdb.service +``` + +Please consult the systemd journald documentation for configuring +journald. 
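As a fuller example of the sysvinit variables described above, a hypothetical `/etc/default/influxdb` that keeps `stderr` in a custom file while explicitly discarding `stdout` might look like this (the log path is purely illustrative):

```bash
# /etc/default/influxdb -- read by the init script when the service starts.
STDERR=/var/log/influxdb/custom-influxd.log
STDOUT=/dev/null
```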
+ +## Using logrotate + +You can use [logrotate](http://manpages.ubuntu.com/manpages/cosmic/en/man8/logrotate.8.html) to rotate the log files generated by InfluxDB on systems where logs are written to flat files. +If using the package install on a sysvinit system, the config file for logrotate is installed in `/etc/logrotate.d`. +You can view the file [here](https://github.com/influxdb/influxdb/blob/1.3/scripts/logrotate). diff --git a/content/influxdb/v1.3/administration/ports.md b/content/influxdb/v1.3/administration/ports.md new file mode 100644 index 000000000..5375a9e60 --- /dev/null +++ b/content/influxdb/v1.3/administration/ports.md @@ -0,0 +1,58 @@ +--- +title: Ports + +menu: + influxdb_1_3: + weight: 20 + parent: Administration +--- + +## Enabled Ports + +### `8086` +The default port that runs the InfluxDB HTTP service. +[Configure this port](/influxdb/v1.3/administration/config/#bind-address-8086) +in the configuration file. + +**Resources** [API Reference](/influxdb/v1.3/tools/api/) + +### 8088 +The default port that runs the RPC service for backup and restore. +[Configure this port](/influxdb/v1.3/administration/config/#bind-address-127-0-0-1-8088) +in the configuration file. + +**Resources** [Backup and Restore](/influxdb/v1.3/administration/backup_and_restore/) + +## Disabled Ports + +### 2003 + +The default port that runs the Graphite service. +[Enable and configure this port](/influxdb/v1.3/administration/config/#bind-address-2003) +in the configuration file. + +**Resources** [Graphite README](https://github.com/influxdata/influxdb/blob/master/services/graphite/README.md) + +### 4242 + +The default port that runs the OpenTSDB service. +[Enable and configure this port](/influxdb/v1.3/administration/config/#bind-address-4242) +in the configuration file. + +**Resources** [OpenTSDB README](https://github.com/influxdata/influxdb/blob/master/services/opentsdb/README.md) + +### 8089 + +The default port that runs the UDP service. 
+[Enable and configure this port](/influxdb/v1.3/administration/config/#bind-address-8089) +in the configuration file. + +**Resources** [UDP README](https://github.com/influxdata/influxdb/blob/master/services/udp/README.md) + +### 25826 + +The default port that runs the Collectd service. +[Enable and configure this port](/influxdb/v1.3/administration/config/#bind-address-25826) +in the configuration file. + +**Resources** [Collectd README](https://github.com/influxdata/influxdb/blob/master/services/collectd/README.md) diff --git a/content/influxdb/v1.3/administration/previous_differences.md b/content/influxdb/v1.3/administration/previous_differences.md new file mode 100644 index 000000000..d142b488c --- /dev/null +++ b/content/influxdb/v1.3/administration/previous_differences.md @@ -0,0 +1,41 @@ +--- +title: Differences between InfluxDB 1.3 and versions prior to 1.2 +aliases: + - influxdb/v1.3/concepts/012_vs_previous/ + - influxdb/v1.3/concepts/011_vs_previous/ + - influxdb/v1.3/concepts/010_vs_previous/ + - influxdb/v1.3/concepts/013_vs_previous/ + - influxdb/v1.3/concepts/1_vs_previous/ +menu: + influxdb_1_3: + weight: 50 + parent: Administration +--- + +If you're using version 1.2, please see [Differences Between InfluxDB 1.3 and 1.2](/influxdb/v1.3/administration/differences/). + +Users looking to upgrade to InfluxDB 1.3 from versions prior to 1.2 should view the following pages in our documentation. 
+
+##### 1.1 users:
+[Differences Between InfluxDB 1.2 and 1.1](/influxdb/v1.2/administration/differences/)
+
+##### 1.0 users:
+[Differences Between InfluxDB 1.1 and 1.0](/influxdb/v1.1/administration/differences/)
+
+##### 0.13 users:
+[Differences Between InfluxDB 1.0 and 0.13](/influxdb/v1.0/administration/013_vs_1/)
+
+##### 0.12 users:
+[Differences Between InfluxDB 0.13 and 0.12](/influxdb/v0.13/administration/012_vs_013/)
+
+##### 0.11 users:
+[Differences between InfluxDB 0.12 and InfluxDB 0.11](/influxdb/v0.12/concepts/011_vs_012/)
+
+##### 0.10 users:
+[Differences between InfluxDB 0.11 and InfluxDB 0.10](/influxdb/v1.3/concepts/010_vs_011/)
+
+##### 0.9 users:
+[Differences between InfluxDB 0.9 and InfluxDB 0.10](/influxdb/v0.10/concepts/09_vs_010/)
+
+##### 0.8 users:
+[Differences between InfluxDB 0.8 and InfluxDB 0.10](/influxdb/v0.10/concepts/08_vs_010/)
diff --git a/content/influxdb/v1.3/administration/security.md b/content/influxdb/v1.3/administration/security.md
new file mode 100644
index 000000000..c498f74d3
--- /dev/null
+++ b/content/influxdb/v1.3/administration/security.md
@@ -0,0 +1,50 @@
+---
+title: Security Best Practices
+menu:
+  influxdb_1_3:
+    weight: 80
+    parent: Administration
+---
+
+Some customers may choose to install InfluxDB with public internet access; however,
+doing so can inadvertently expose your data and invite unwelcome attacks on your database.
+Check out the sections below for how to protect the data in your InfluxDB instance.
+
+## Enable Authentication
+
+Password protect your InfluxDB instance to keep any unauthorized individuals
+from accessing your data.
+
+Resources:
+[Set up Authentication](/influxdb/v1.3/query_language/authentication_and_authorization/#set-up-authentication)
+
+## Manage Users and their Permissions
+
+Restrict access by creating individual users and assigning them relevant
+read and/or write permissions.
+
+Resources:
+[User Types and Privileges](/influxdb/v1.3/query_language/authentication_and_authorization/#user-types-and-privileges),
+[User Management Commands](/influxdb/v1.3/query_language/authentication_and_authorization/#user-management-commands)
+
+## Set up HTTPS
+
+Using HTTPS secures the communication between clients and the InfluxDB server, and, in
+some cases, HTTPS verifies the authenticity of the InfluxDB server to connecting clients.
+
+Resources:
+[HTTPS Setup](/influxdb/v1.3/administration/https_setup/)
+
+## Secure your Host
+
+### Ports
+If you're only running InfluxDB, close all ports on the host except for port `8086`.
+You can also use a proxy to port `8086`.
+
+InfluxDB uses port `8088` for remote [backups and restores](/influxdb/v1.3/administration/backup_and_restore/).
+We highly recommend closing that port and, if performing a remote backup,
+giving specific permission only to the remote machine.
+
+### AWS Recommendations
+
+We recommend implementing on-disk encryption; InfluxDB does not offer built-in support to encrypt the data.
diff --git a/content/influxdb/v1.3/administration/stability_and_compatibility.md b/content/influxdb/v1.3/administration/stability_and_compatibility.md
new file mode 100644
index 000000000..446c56b74
--- /dev/null
+++ b/content/influxdb/v1.3/administration/stability_and_compatibility.md
@@ -0,0 +1,28 @@
+---
+title: Stability and Compatibility
+menu:
+  influxdb_1_3:
+    weight: 80
+    parent: Administration
+---
+
+## 1.x API compatibility and stability
+
+One of the more important aspects of the 1.0 release is that this marks the stabilization of our API and storage format. Over the course of the last three years we’ve iterated aggressively, often breaking the API in the process.
With the release of 1.0 and for the entire 1.x line of releases we’re committing to the following: + +### No breaking HTTP API changes + +When it comes to the HTTP API, if a command works in 1.0 it will work unchanged in all 1.x releases...with one caveat. We will be adding [keywords](/influxdb/v1.3/query_language/spec/#keywords) to the query language. New keywords won't break your queries if you wrap all [identifiers](/influxdb/v1.3/concepts/glossary/#identifier) in double quotes and all string literals in single quotes. This is generally considered best practice so it should be followed anyway. For users following that guideline, the query and ingestion APIs will have no breaking changes for all 1.x releases. Note that this does not include the Go code in the project. The underlying Go API in InfluxDB can and will change over the course of 1.x development. Users should be accessing InfluxDB through the [HTTP API](/influxdb/v1.3/tools/api/). + +### Storage engine stability + +The [TSM](/influxdb/v1.3/concepts/glossary/#tsm-time-structured-merge-tree) storage engine file format is now at version 1. While we may introduce new versions of the format in the 1.x releases, these new versions will run side-by-side with previous versions. What this means for users is there will be no lengthy migrations when upgrading from one 1.x release to another. + +### Additive changes + +The query engine will have additive changes over the course of the new releases. We’ll introduce new query functions and new functionality into the language without breaking backwards compatibility. We may introduce new protocol endpoints (like a binary format) and versions of the line protocol and query API to improve performance and/or functionality, but they will have to run in parallel with the existing versions. Existing versions will be supported for the entirety of the 1.x release line. 
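The quoting guideline above is straightforward to apply. A brief illustration (the measurement and key names here are invented for the example):

```sql
-- Safe: identifiers in double quotes, string literals in single quotes.
-- If a later 1.x release reserves a word like "level", this query still parses.
SELECT "level" FROM "water_depth" WHERE "location" = 'coyote_creek'
```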
+
+
+### Ongoing support
+
+We’ll continue to fix bugs on the 1.x versions of the [line protocol](/influxdb/v1.3/concepts/glossary/#line-protocol), query API, and TSM storage format. Users should expect to upgrade to the latest 1.x.x release for bug fixes, but those releases will all be compatible with the 1.0 API and won’t require data migrations. For instance, if a user is running 1.1 and there are bug fixes released in 1.2, they should upgrade to the 1.2 release. Until 1.3 is released, patch fixes will go into 1.2.x. Because all future 1.x releases are drop-in replacements for previous 1.x releases, users should upgrade to the latest in the 1.x line to get all bug fixes.
diff --git a/content/influxdb/v1.3/administration/upgrading.md b/content/influxdb/v1.3/administration/upgrading.md
new file mode 100644
index 000000000..242b8452d
--- /dev/null
+++ b/content/influxdb/v1.3/administration/upgrading.md
@@ -0,0 +1,155 @@
+---
+title: Upgrading from previous versions
+menu:
+  influxdb_1_3:
+    weight: 60
+    parent: Administration
+---
+
+This page outlines the process for upgrading from:
+
+* Version 0.12-1.2 to 1.3
+* Version 0.10 or 0.11 to 1.3
+
+## Upgrade from 0.12-1.2 to 1.3
+
+1. [Download](https://influxdata.com/downloads/#influxdb) InfluxDB version
+1.3
+
+2. Update the configuration file
+
+   Migrate any customizations in the 1.2 configuration file to the [1.3 configuration file](/influxdb/v1.3/administration/config/).
+
+3. Restart the process
+
+4. Check out the new features outlined in
+[Differences between InfluxDB 1.3 and 1.2](/influxdb/v1.3/administration/differences/)
+
+## Upgrade from 0.10 or 0.11 to 1.3
+
+> **Note:** 0.10 users will need to
+[convert](/influxdb/v0.10/administration/upgrading/#convert-b1-and-bz1-shards-to-tsm1)
+any remaining `b1` and `bz1` shards to `TSM` format before following the
+instructions below.
+InfluxDB 1.3 cannot read non-`TSM` shards.
+Check for non-`TSM` shards in your data directory:
+>
+* Non-`TSM` shards are files of the form: `data///`
+* `TSM` shards are files of the form: `data////.tsm`
+
+In versions prior to 0.12, InfluxDB stores
+[metastore](/influxdb/v1.3/concepts/glossary/#metastore) information in
+`raft.db` via the raft services.
+In versions 0.12+, InfluxDB stores metastore information in `meta.db`, a binary
+protobuf file.
+
+The following steps outline how to transfer metastore information to the new
+format.
+They also outline when to upgrade the binary to 1.3 and when to generate a
+new configuration file.
+
+To start out, you must be working with version 0.10 or 0.11 (don't upgrade the
+`influxd` binary yet!).
+If you've already upgraded the binary to 1.3, [reinstall 0.11.1](/influxdb/v0.12/administration/upgrading/#urls-for-influxdb-0-11);
+InfluxDB 1.3 will yield an error
+(`run: create server: detected /var/lib/influxdb/meta/raft.db. [...]`) if you
+attempt to start the process without completing the steps below.
+The examples below assume you are working with a version of Linux.
+
+> Before you start, we recommend making a copy of the entire 0.10 or 0.11 `meta`
+directory in case you experience problems with the upgrade.
The upgrade process
+removes the `raft.db` and `node.json` files from the `meta` directory:
+>
+```
+cp -r
+```
+>
+Example:
+>
+Create a copy of the 0.10 or 0.11 `meta` directory in `backups/`:
+```
+~# cp -r /var/lib/influxdb/meta backups/
+```
+
+**1.** While still running 0.10 or 0.11, export the metastore data to a different
+directory:
+
+```
+influxd backup
+```
+
+The directory will be created if it doesn't already exist.
+
+Example:
+
+Export the 0.10 or 0.11 metastore to `/tmp/backup`:
+```
+~# influxd backup /tmp/backup/
+2016/04/01 15:33:35 backing up metastore to /tmp/backup/meta.00
+2016/04/01 15:33:35 backup complete
+```
+
+**2.** Stop the `influxdb` service:
+
+```
+sudo service influxdb stop
+```
+
+**3.** [Upgrade](https://influxdata.com/downloads/#influxdb) the `influxd`
+binary to 1.3, but do not start the service.
+
+**4.** Upgrade your metastore to the 1.3 store by performing a `restore` with
+the backup you created in step 1:
+
+```
+influxd restore -metadir=
+```
+
+Example:
+
+Restore `/tmp/backup` to the meta directory in `/var/lib/influxdb/meta`:
+```
+~# influxd restore -metadir=/var/lib/influxdb/meta /tmp/backup
+Using metastore snapshot: /tmp/backup/meta.00
+```
+
+**5.** Update the permissions on the meta database:
+
+```
+chown influxdb:influxdb /meta.db
+```
+
+Example:
+
+```
+~# chown influxdb:influxdb /var/lib/influxdb/meta/meta.db
+```
+
+**6.** Update the configuration file:
+
+Compare your old configuration file against the [1.3 configuration file](/influxdb/v1.3/administration/config/)
+and manually update any defaults with your localized settings.
+
+**7.** Start the 1.3 service:
+
+```
+sudo service influxdb start
+```
+
+**8.** Confirm that your metastore data is present:
+
+The 1.3 output from the queries `SHOW DATABASES`, `SHOW USERS`, and
+`SHOW RETENTION POLICIES ON ` should match the 0.10 or 0.11
+output.
+
+If your metastore data do not appear to be present, stop the service, reinstall
+InfluxDB 0.10 or 0.11, restore the copy you made of the entire 0.10 or 0.11 `meta` directory to
+the `meta` directory, and try working through these steps again.
+
+**9.** Check out the new features outlined in
+[Differences between InfluxDB 1.3 and 1.2](/influxdb/v1.3/administration/differences/).
diff --git a/content/influxdb/v1.3/concepts/_index.md b/content/influxdb/v1.3/concepts/_index.md
new file mode 100644
index 000000000..9fa64063d
--- /dev/null
+++ b/content/influxdb/v1.3/concepts/_index.md
@@ -0,0 +1,31 @@
+---
+title: InfluxDB concepts
+menu:
+  influxdb_1_3:
+    name: Concepts
+    weight: 30
+---
+
+Understanding the following concepts will help you get the most out of InfluxDB.
+
+## [Key Concepts](/influxdb/v1.3/concepts/key_concepts/)
+
+A brief explanation of InfluxDB's core architecture, useful for new users.
+
+## [Glossary of Terms](/influxdb/v1.3/concepts/glossary/)
+
+A list of InfluxDB terms and their definitions.
+
+## [Comparison to SQL](/influxdb/v1.3/concepts/crosswalk/)
+
+## [Design Insights and Tradeoffs](/influxdb/v1.3/concepts/insights_tradeoffs/)
+
+A brief treatment of some of the performance tradeoffs made during the design phase of InfluxDB.
+
+## [Schema and Data Layout](/influxdb/v1.3/concepts/schema_and_data_layout/)
+
+A useful overview of the InfluxDB time series data structure and how it affects performance.
+
+## [Storage Engine](/influxdb/v1.3/concepts/storage_engine/)
+
+An overview of how InfluxDB stores data on disk.
diff --git a/content/influxdb/v1.3/concepts/crosswalk.md b/content/influxdb/v1.3/concepts/crosswalk.md
new file mode 100644
index 000000000..fe49c6329
--- /dev/null
+++ b/content/influxdb/v1.3/concepts/crosswalk.md
@@ -0,0 +1,196 @@
+---
+title: Comparison to SQL
+menu:
+  influxdb_1_3:
+    weight: 20
+    parent: Concepts
+---
+
+# What's in a database?
+ +This page gives SQL users an overview of how InfluxDB is like an SQL database and how it's not. +It highlights some of the major distinctions between the two and provides a loose crosswalk between the different database terminologies and query languages. + +## In general... + +InfluxDB is designed to work with time-series data. +SQL databases can handle time-series but weren't created strictly for that purpose. +In short, InfluxDB is made to store a large volume of time-series data and perform real-time analysis on those data, quickly. + +### Timing is everything + +In InfluxDB, a timestamp identifies a single point in any given data series. +This is like an SQL database table where the primary key is pre-set by the system and is always time. + +InfluxDB also recognizes that your [schema](/influxdb/v1.3/concepts/glossary/#schema) preferences may change over time. +In InfluxDB you don't have to define schemas up front. +Data points can have one of the fields on a measurement, all of the fields on a measurement, or any number in-between. +You can add new fields to a measurement simply by writing a point for that new field. +If you need an explanation of the terms measurements, tags, and fields check out the next section for an SQL database to InfluxDB terminology crosswalk. + +## Terminology + +The table below is a (very) simple example of a table called `foodships` in an SQL database +with the unindexed column `#_foodships` and the indexed columns `park_id`, `planet`, and `time`. 
+ +``` sql ++---------+---------+---------------------+--------------+ +| park_id | planet | time | #_foodships | ++---------+---------+---------------------+--------------+ +| 1 | Earth | 1429185600000000000 | 0 | +| 1 | Earth | 1429185601000000000 | 3 | +| 1 | Earth | 1429185602000000000 | 15 | +| 1 | Earth | 1429185603000000000 | 15 | +| 2 | Saturn | 1429185600000000000 | 5 | +| 2 | Saturn | 1429185601000000000 | 9 | +| 2 | Saturn | 1429185602000000000 | 10 | +| 2 | Saturn | 1429185603000000000 | 14 | +| 3 | Jupiter | 1429185600000000000 | 20 | +| 3 | Jupiter | 1429185601000000000 | 21 | +| 3 | Jupiter | 1429185602000000000 | 21 | +| 3 | Jupiter | 1429185603000000000 | 20 | +| 4 | Saturn | 1429185600000000000 | 5 | +| 4 | Saturn | 1429185601000000000 | 5 | +| 4 | Saturn | 1429185602000000000 | 6 | +| 4 | Saturn | 1429185603000000000 | 5 | ++---------+---------+---------------------+--------------+ +``` + +Those same data look like this in InfluxDB: + +```sql +name: foodships +tags: park_id=1, planet=Earth +time #_foodships +---- ------------ +2015-04-16T12:00:00Z 0 +2015-04-16T12:00:01Z 3 +2015-04-16T12:00:02Z 15 +2015-04-16T12:00:03Z 15 + +name: foodships +tags: park_id=2, planet=Saturn +time #_foodships +---- ------------ +2015-04-16T12:00:00Z 5 +2015-04-16T12:00:01Z 9 +2015-04-16T12:00:02Z 10 +2015-04-16T12:00:03Z 14 + +name: foodships +tags: park_id=3, planet=Jupiter +time #_foodships +---- ------------ +2015-04-16T12:00:00Z 20 +2015-04-16T12:00:01Z 21 +2015-04-16T12:00:02Z 21 +2015-04-16T12:00:03Z 20 + +name: foodships +tags: park_id=4, planet=Saturn +time #_foodships +---- ------------ +2015-04-16T12:00:00Z 5 +2015-04-16T12:00:01Z 5 +2015-04-16T12:00:02Z 6 +2015-04-16T12:00:03Z 5 +``` + +Referencing the example above, in general: + +* An InfluxDB measurement (`foodships`) is similar to an SQL database table. +* InfluxDB tags ( `park_id` and `planet`) are like indexed columns in an SQL database. 
+* InfluxDB fields (`#_foodships`) are like unindexed columns in an SQL database. +* InfluxDB points (for example, `2015-04-16T12:00:00Z 5`) are similar to SQL rows. + +Building on this comparison of database terminology, +InfluxDB's [continuous queries](/influxdb/v1.3/concepts/glossary/#continuous-query-cq) +and [retention policies](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) are +similar to stored procedures in an SQL database. +They're specified once and then performed regularly and automatically. + +Of course, there are some major disparities between SQL databases and InfluxDB. +SQL `JOIN`s aren't available for InfluxDB measurements; your schema design should reflect that difference. +And, as we mentioned above, a measurement is like an SQL table where the primary index is always pre-set to time. +InfluxDB timestamps must be in UNIX epoch (GMT) or formatted as a date-time string valid under RFC3339. + +For more detailed descriptions of the InfluxDB terms mentioned in this section see our [Glossary of Terms](/influxdb/v1.3/concepts/glossary/). + +## InfluxQL and SQL + +InfluxQL is an SQL-like query language for interacting with InfluxDB. +It has been lovingly crafted to feel familiar to those coming from other +SQL or SQL-like environments while also providing features specific +to storing and analyzing time series data. + +InfluxQL's `SELECT` statement follows the form of an SQL `SELECT` statement: + +```sql +SELECT FROM WHERE +``` +where `WHERE` is optional. 
+To get the InfluxDB output in the section above, you'd enter:
+
+```sql
+SELECT * FROM "foodships"
+```
+
+If you only wanted to see data for the planet `Saturn`, you'd enter:
+
+```sql
+SELECT * FROM "foodships" WHERE "planet" = 'Saturn'
+```
+
+If you wanted to see data for the planet `Saturn` after 12:00:01 UTC on April 16, 2015, you'd enter:
+
+```sql
+SELECT * FROM "foodships" WHERE "planet" = 'Saturn' AND time > '2015-04-16 12:00:01'
+```
+
+As shown in the example above, InfluxQL allows you to specify the time range of your query in the `WHERE` clause.
+You can use date-time strings wrapped in single quotes that have the
+format `YYYY-MM-DD HH:MM:SS.mmm`
+(`mmm` is milliseconds and is optional, and you can also specify microseconds or nanoseconds).
+You can also use relative time with `now()`, which refers to the server's current timestamp:
+
+```sql
+SELECT * FROM "foodships" WHERE time > now() - 1h
+```
+
+That query outputs the data in the `foodships` measurement where the timestamp is newer than the server's current time minus one hour.
+The options for specifying time durations with `now()` are:
+
+| Letter | Meaning |
+|:------:|:------------:|
+| ns | nanoseconds |
+| u or µ | microseconds |
+| ms | milliseconds |
+| s | seconds |
+| m | minutes |
+| h | hours |
+| d | days |
+| w | weeks |
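The nanosecond epoch timestamps in the terminology section map directly to the RFC3339 strings InfluxDB displays. Assuming GNU `date` (the `-d @` flag is a GNU extension), you can verify one of the conversions from the `foodships` example yourself:

```shell
# 1429185600000000000 ns is 1429185600 s since the Unix epoch;
# GNU date renders it as the RFC3339 timestamp shown in the InfluxDB output.
date -u -d @1429185600 +%Y-%m-%dT%H:%M:%SZ   # prints 2015-04-16T12:00:00Z
```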
+ +InfluxQL also supports regular expressions, arithmetic in expressions, `SHOW` statements, and `GROUP BY` statements. +See our [data exploration](/influxdb/v1.3/query_language/data_exploration/) page for an in-depth discussion of those topics. +InfluxQL functions include `COUNT`, `MIN`, `MAX`, `MEDIAN`, `DERIVATIVE` and more. +For a full list check out the [functions](/influxdb/v1.3/query_language/functions/) page. + +Now that you have the general idea, check out our [Getting Started Guide](/influxdb/v1.3/introduction/getting_started/). + +## A note on why InfluxDB isn't CRUD... + +InfluxDB is a database that has been optimized for time series data. +This data commonly comes from sources like distributed sensor groups, click data from large websites, or lists of financial transactions. + +One thing this data has in common is that it is more useful in the aggregate. +One reading saying that your computer’s CPU is at 12% utilization at 12:38:35 UTC on a Tuesday is hard to draw conclusions from. +It becomes more useful when combined with the rest of the series and visualized. +This is where trends over time begin to show, and actionable insight can be drawn from the data. +In addition, time series data is generally written once and rarely updated. + +The result is that InfluxDB is not a full CRUD database but more like a CR-ud, +prioritizing the performance of creating and reading data over update and destroy, +and preventing some update and destroy behaviors to make create and read more performant. diff --git a/content/influxdb/v1.3/concepts/glossary.md b/content/influxdb/v1.3/concepts/glossary.md new file mode 100644 index 000000000..c849ab977 --- /dev/null +++ b/content/influxdb/v1.3/concepts/glossary.md @@ -0,0 +1,337 @@ +--- +title: Glossary of Terms +menu: + influxdb_1_3: + weight: 10 + parent: Concepts +--- + +## aggregation +An InfluxQL function that returns an aggregated value across a set of points. 
+See [InfluxQL Functions](/influxdb/v1.3/query_language/functions/#aggregations) for a complete list of the available and upcoming aggregations. + +Related entries: [function](/influxdb/v1.3/concepts/glossary/#function), [selector](/influxdb/v1.3/concepts/glossary/#selector), [transformation](/influxdb/v1.3/concepts/glossary/#transformation) + +## batch +A collection of points in line protocol format, separated by newlines (`0x0A`). +A batch of points may be submitted to the database using a single HTTP request to the write endpoint. +This makes writes via the HTTP API much more performant by drastically reducing the HTTP overhead. +InfluxData recommends batch sizes of 5,000-10,000 points, although different use cases may be better served by significantly smaller or larger batches. + +Related entries: [line protocol](/influxdb/v1.3/concepts/glossary/#line-protocol), [point](/influxdb/v1.3/concepts/glossary/#point) + +## continuous query (CQ) +An InfluxQL query that runs automatically and periodically within a database. +Continuous queries require a function in the `SELECT` clause and must include a `GROUP BY time()` clause. +See [Continuous Queries](/influxdb/v1.3/query_language/continuous_queries/). + + +Related entries: [function](/influxdb/v1.3/concepts/glossary/#function) + +## database +A logical container for users, retention policies, continuous queries, and time series data. + +Related entries: [continuous query](/influxdb/v1.3/concepts/glossary/#continuous-query-cq), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [user](/influxdb/v1.3/concepts/glossary/#user) + +## duration +The attribute of the retention policy that determines how long InfluxDB stores data. +Data older than the duration are automatically dropped from the database. +See [Database Management](/influxdb/v1.3/query_language/database_management/#create-retention-policies-with-create-retention-policy) for how to set duration. 
+ +Related entries: [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) + +## field +The key-value pair in InfluxDB's data structure that records metadata and the actual data value. +Fields are required in InfluxDB's data structure and they are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant relative to tags. + +*Query tip:* Compare fields to tags; tags are indexed. + +Related entries: [field key](/influxdb/v1.3/concepts/glossary/#field-key), [field set](/influxdb/v1.3/concepts/glossary/#field-set), [field value](/influxdb/v1.3/concepts/glossary/#field-value), [tag](/influxdb/v1.3/concepts/glossary/#tag) + +## field key +The key part of the key-value pair that makes up a field. +Field keys are strings and they store metadata. + +Related entries: [field](/influxdb/v1.3/concepts/glossary/#field), [field set](/influxdb/v1.3/concepts/glossary/#field-set), [field value](/influxdb/v1.3/concepts/glossary/#field-value), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key) + +## field set +The collection of field keys and field values on a point. + +Related entries: [field](/influxdb/v1.3/concepts/glossary/#field), [field key](/influxdb/v1.3/concepts/glossary/#field-key), [field value](/influxdb/v1.3/concepts/glossary/#field-value), [point](/influxdb/v1.3/concepts/glossary/#point) + +## field value +The value part of the key-value pair that makes up a field. +Field values are the actual data; they can be strings, floats, integers, or booleans. +A field value is always associated with a timestamp. + +Field values are not indexed - queries on field values scan all points that match the specified time range and, as a result, are not performant. + +*Query tip:* Compare field values to tag values; tag values are indexed. 
+ +Related entries: [field](/influxdb/v1.3/concepts/glossary/#field), [field key](/influxdb/v1.3/concepts/glossary/#field-key), [field set](/influxdb/v1.3/concepts/glossary/#field-set), [tag value](/influxdb/v1.3/concepts/glossary/#tag-value), [timestamp](/influxdb/v1.3/concepts/glossary/#timestamp) + +## function +InfluxQL aggregations, selectors, and transformations. +See [InfluxQL Functions](/influxdb/v1.3/query_language/functions/) for a complete list of InfluxQL functions. + +Related entries: [aggregation](/influxdb/v1.3/concepts/glossary/#aggregation), [selector](/influxdb/v1.3/concepts/glossary/#selector), [transformation](/influxdb/v1.3/concepts/glossary/#transformation) + +## identifier +Tokens that refer to continuous query names, database names, field keys, +measurement names, retention policy names, subscription names, tag keys, and +user names. +See [Query Language Specification](/influxdb/v1.3/query_language/spec/#identifiers). + +Related entries: +[database](/influxdb/v1.3/concepts/glossary/#database), +[field key](/influxdb/v1.3/concepts/glossary/#field-key), +[measurement](/influxdb/v1.3/concepts/glossary/#measurement), +[retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), +[tag key](/influxdb/v1.3/concepts/glossary/#tag-key), +[user](/influxdb/v1.3/concepts/glossary/#user) + +## line protocol +The text based format for writing points to InfluxDB. See [Line Protocol](/influxdb/v1.3/write_protocols/). + +## measurement +The part of InfluxDB's structure that describes the data stored in the associated fields. +Measurements are strings. + +Related entries: [field](/influxdb/v1.3/concepts/glossary/#field), [series](/influxdb/v1.3/concepts/glossary/#series) + +## metastore +Contains internal information about the status of the system. +The metastore contains the user information, databases, retention policies, shard metadata, continuous queries, and subscriptions. 
+
+Related entries: [database](/influxdb/v1.3/concepts/glossary/#database), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [user](/influxdb/v1.3/concepts/glossary/#user)
+
+## node
+An independent `influxd` process.
+
+Related entries: [server](/influxdb/v1.3/concepts/glossary/#server)
+
+## now()
+The local server's nanosecond timestamp.
+
+## point
+The part of InfluxDB's data structure that consists of a single collection of fields in a series.
+Each point is uniquely identified by its series and timestamp.
+
+You cannot store more than one point with the same timestamp in the same series.
+Instead, when you write a new point to the same series with the same timestamp as an existing point in that series, the field set becomes the union of the old field set and the new field set, where any ties go to the new field set.
+For an example, see [Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points).
+
+Related entries: [field set](/influxdb/v1.3/concepts/glossary/#field-set), [series](/influxdb/v1.3/concepts/glossary/#series), [timestamp](/influxdb/v1.3/concepts/glossary/#timestamp)
+
+## points per second
+A deprecated measurement of the rate at which data are persisted to InfluxDB.
+The schema allows and even encourages the recording of multiple metric values per point, rendering points per second ambiguous.
+
+Write speeds are generally quoted in values per second, a more precise metric.
+
+Related entries: [point](/influxdb/v1.3/concepts/glossary/#point), [schema](/influxdb/v1.3/concepts/glossary/#schema), [values per second](/influxdb/v1.3/concepts/glossary/#values-per-second)
+
+## query
+An operation that retrieves data from InfluxDB.
+See [Data Exploration](/influxdb/v1.3/query_language/data_exploration/), [Schema Exploration](/influxdb/v1.3/query_language/schema_exploration/), [Database Management](/influxdb/v1.3/query_language/database_management/).
+ +## replication factor +The attribute of the retention policy that determines how many copies of the data are stored in the cluster. +InfluxDB replicates data across `N` data nodes, where `N` is the replication factor. + +{{% warn %}} Replication factors do not serve a purpose with single node instances. +{{% /warn %}} + +Related entries: [duration](/influxdb/v1.3/concepts/glossary/#duration), [node](/influxdb/v1.3/concepts/glossary/#node), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) + +## retention policy (RP) +The part of InfluxDB's data structure that describes how long InfluxDB keeps data (duration), how many copies of the data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). +RPs are unique per database and, along with the measurement and tag set, define a series. + +When you create a database, InfluxDB automatically creates a retention policy called `autogen` with an infinite duration, a replication factor set to one, and a shard group duration set to seven days. +See [Database Management](/influxdb/v1.3/query_language/database_management/#retention-policy-management) for retention policy management. + +{{% warn %}} Replication factors do not serve a purpose with single node instances. +{{% /warn %}} + +Related entries: [duration](/influxdb/v1.3/concepts/glossary/#duration), [measurement](/influxdb/v1.3/concepts/glossary/#measurement), [replication factor](/influxdb/v1.3/concepts/glossary/#replication-factor), [series](/influxdb/v1.3/concepts/glossary/#series), [shard duration](/influxdb/v1.3/concepts/glossary/#shard-duration), [tag set](/influxdb/v1.3/concepts/glossary/#tag-set) + +## schema +How the data are organized in InfluxDB. +The fundamentals of the InfluxDB schema are databases, retention policies, series, measurements, tag keys, tag values, and field keys. +See [Schema Design](/influxdb/v1.3/concepts/schema_and_data_layout/) for more information. 
+ +Related entries: [database](/influxdb/v1.3/concepts/glossary/#database), [field key](/influxdb/v1.3/concepts/glossary/#field-key), [measurement](/influxdb/v1.3/concepts/glossary/#measurement), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.3/concepts/glossary/#series), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key), [tag value](/influxdb/v1.3/concepts/glossary/#tag-value) + +## selector +An InfluxQL function that returns a single point from the range of specified points. +See [InfluxQL Functions](/influxdb/v1.3/query_language/functions/#selectors) for a complete list of the available and upcoming selectors. + +Related entries: [aggregation](/influxdb/v1.3/concepts/glossary/#aggregation), [function](/influxdb/v1.3/concepts/glossary/#function), [transformation](/influxdb/v1.3/concepts/glossary/#transformation) + +## series +The collection of data in InfluxDB's data structure that share a measurement, tag set, and retention policy. + + +> **Note:** The field set is not part of the series identification! + +Related entries: [field set](/influxdb/v1.3/concepts/glossary/#field-set), [measurement](/influxdb/v1.3/concepts/glossary/#measurement), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [tag set](/influxdb/v1.3/concepts/glossary/#tag-set) + +## series cardinality +The number of unique database, measurement, and tag set combinations in an InfluxDB instance. + +For example, assume that an InfluxDB instance has a single database and one measurement. +The single measurement has two tag keys: `email` and `status`. 
+If there are three different `email`s, and each email address is associated with two +different `status`es then the series cardinality for the measurement is 6 +(3 * 2 = 6): + +| email | status | +| :-------------------- | :----- | +| lorr@influxdata.com | start | +| lorr@influxdata.com | finish | +| marv@influxdata.com | start | +| marv@influxdata.com | finish | +| cliff@influxdata.com | start | +| cliff@influxdata.com | finish | + +Note that, in some cases, simply performing that multiplication may overestimate series cardinality because of the presence of dependent tags. +Dependent tags are tags that are scoped by another tag and do not increase series +cardinality. +If we add the tag `firstname` to the example above, the series cardinality +would not be 18 (3 * 2 * 3 = 18). +It would remain unchanged at 6, as `firstname` is already scoped by the `email` tag: + +| email | status | firstname | +| :-------------------- | :----- | :-------- | +| lorr@influxdata.com | start | lorraine | +| lorr@influxdata.com | finish | lorraine | +| marv@influxdata.com | start | marvin | +| marv@influxdata.com | finish | marvin | +| cliff@influxdata.com | start | clifford | +| cliff@influxdata.com | finish | clifford | + +See [Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-can-i-query-for-series-cardinality) for how to query InfluxDB for series +cardinality. + +Related entries: [tag set](/influxdb/v1.3/concepts/glossary/#tag-set), [measurement](/influxdb/v1.3/concepts/glossary/#measurement), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key) + +## server +A machine, virtual or physical, that is running InfluxDB. +There should only be one InfluxDB process per server. + +Related entries: [node](/influxdb/v1.3/concepts/glossary/#node) + +## shard + +A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk. +Every shard belongs to one and only one shard group. 
+Multiple shards may exist in a single shard group. +Each shard contains a specific set of series. +All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk. + +Related entries: [series](/influxdb/v1.3/concepts/glossary/#series), [shard duration](/influxdb/v1.3/concepts/glossary/#shard-duration), [shard group](/influxdb/v1.3/concepts/glossary/#shard-group), [tsm](/influxdb/v1.3/concepts/glossary/#tsm-time-structured-merge-tree) + +## shard duration + +The shard duration determines how much time each shard group spans. +The specific interval is determined by the `SHARD DURATION` of the retention policy. +See [Retention Policy management](/influxdb/v1.3/query_language/database_management/#retention-policy-management) for more information. + +For example, given a retention policy with `SHARD DURATION` set to `1w`, each shard group will span a single week and contain all points with timestamps in that week. + +Related entries: [database](/influxdb/v1.3/concepts/glossary/#database), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.3/concepts/glossary/#series), [shard](/influxdb/v1.3/concepts/glossary/#shard), [shard group](/influxdb/v1.3/concepts/glossary/#shard-group) + +## shard group + +Shard groups are logical containers for shards. +Shard groups are organized by time and retention policy. +Every retention policy that contains data has at least one associated shard group. +A given shard group contains all shards with data for the interval covered by the shard group. +The interval spanned by each shard group is the shard duration. 
+ +Related entries: [database](/influxdb/v1.3/concepts/glossary/#database), [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp), [series](/influxdb/v1.3/concepts/glossary/#series), [shard](/influxdb/v1.3/concepts/glossary/#shard), [shard duration](/influxdb/v1.3/concepts/glossary/#shard-duration) + +## subscription +Subscriptions allow [Kapacitor](/kapacitor/latest/) to receive data from InfluxDB in a push model rather than the pull model based on querying data. +When Kapacitor is configured to work with InfluxDB, the subscription will automatically push every write for the subscribed database from InfluxDB to Kapacitor. +Subscriptions can use TCP or UDP for transmitting the writes. + +## tag +The key-value pair in InfluxDB's data structure that records metadata. +Tags are an optional part of InfluxDB's data structure but they are useful for storing commonly-queried metadata; tags are indexed so queries on tags are performant. +*Query tip:* Compare tags to fields; fields are not indexed. + +Related entries: [field](/influxdb/v1.3/concepts/glossary/#field), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key), [tag set](/influxdb/v1.3/concepts/glossary/#tag-set), [tag value](/influxdb/v1.3/concepts/glossary/#tag-value) + +## tag key +The key part of the key-value pair that makes up a tag. +Tag keys are strings and they store metadata. +Tag keys are indexed so queries on tag keys are performant. + +*Query tip:* Compare tag keys to field keys; field keys are not indexed. + +Related entries: [field key](/influxdb/v1.3/concepts/glossary/#field-key), [tag](/influxdb/v1.3/concepts/glossary/#tag), [tag set](/influxdb/v1.3/concepts/glossary/#tag-set), [tag value](/influxdb/v1.3/concepts/glossary/#tag-value) + +## tag set +The collection of tag keys and tag values on a point. 
+ +Related entries: [point](/influxdb/v1.3/concepts/glossary/#point), [series](/influxdb/v1.3/concepts/glossary/#series), [tag](/influxdb/v1.3/concepts/glossary/#tag), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key), [tag value](/influxdb/v1.3/concepts/glossary/#tag-value) + +## tag value +The value part of the key-value pair that makes up a tag. +Tag values are strings and they store metadata. +Tag values are indexed so queries on tag values are performant. + +Related entries: [tag](/influxdb/v1.3/concepts/glossary/#tag), [tag key](/influxdb/v1.3/concepts/glossary/#tag-key), [tag set](/influxdb/v1.3/concepts/glossary/#tag-set) + +## timestamp +The date and time associated with a point. +All time in InfluxDB is UTC. + +For how to specify time when writing data, see [Write Syntax](/influxdb/v1.3/write_protocols/write_syntax/). +For how to specify time when querying data, see [Data Exploration](/influxdb/v1.3/query_language/data_exploration/#time-syntax). + +Related entries: [point](/influxdb/v1.3/concepts/glossary/#point) + +## transformation +An InfluxQL function that returns a value or a set of values calculated from specified points, but does not return an aggregated value across those points. +See [InfluxQL Functions](/influxdb/v1.3/query_language/functions/#transformations) for a complete list of the available and upcoming transformations. + +Related entries: [aggregation](/influxdb/v1.3/concepts/glossary/#aggregation), [function](/influxdb/v1.3/concepts/glossary/#function), [selector](/influxdb/v1.3/concepts/glossary/#selector) + +## tsm (Time Structured Merge tree) +The purpose-built data storage format for InfluxDB. TSM allows for greater compaction and higher write and read throughput than existing B+ or LSM tree implementations. See [Storage Engine](http://docs.influxdata.com/influxdb/v1.3/concepts/storage_engine/) for more. 
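As the timestamp entry notes, all time in InfluxDB is UTC, and timestamps are stored with up to nanosecond precision. A small illustrative sketch of converting between an RFC3339 string and a nanosecond epoch value, using only the Python standard library and assuming whole-second timestamps for simplicity:

```python
from datetime import datetime, timezone

def rfc3339_to_ns(ts: str) -> int:
    """Convert an RFC3339 UTC timestamp to nanoseconds since the Unix epoch."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1_000_000_000

def ns_to_rfc3339(ns: int) -> str:
    """Inverse conversion, truncating any sub-second precision."""
    dt = datetime.fromtimestamp(ns // 1_000_000_000, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

# Round-trips cleanly for whole-second timestamps:
assert ns_to_rfc3339(rfc3339_to_ns("2015-08-18T00:00:00Z")) == "2015-08-18T00:00:00Z"
```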
+ +## user +There are two kinds of users in InfluxDB: + +* *Admin users* have `READ` and `WRITE` access to all databases and full access to administrative queries and user management commands. +* *Non-admin users* have `READ`, `WRITE`, or `ALL` (both `READ` and `WRITE`) access per database. + +When authentication is enabled, InfluxDB only executes HTTP requests that are sent with a valid username and password. +See [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/). + +## values per second +The preferred measurement of the rate at which data are persisted to InfluxDB. Write speeds are generally quoted in values per second. + +To calculate the values per second rate, multiply the number of points written per second by the number of values stored per point. For example, if the points have four fields each, and a batch of 5000 points is written 10 times per second, then the values per second rate is `4 field values per point * 5000 points per batch * 10 batches per second = 200,000 values per second`. + +Related entries: [batch](/influxdb/v1.3/concepts/glossary/#batch), [field](/influxdb/v1.3/concepts/glossary/#field), [point](/influxdb/v1.3/concepts/glossary/#point), [points per second](/influxdb/v1.3/concepts/glossary/#points-per-second) + +## wal (Write Ahead Log) +The temporary cache for recently written points. To reduce the frequency with which the permanent storage files are accessed, InfluxDB caches new points in the WAL until their total size or age triggers a flush to more permanent storage. This allows for efficient batching of the writes into the TSM. + +Points in the WAL can be queried, and they persist through a system reboot. On process start, all points in the WAL must be flushed before the system accepts new writes. 
+ +Related entries: [tsm](/influxdb/v1.3/concepts/glossary/#tsm-time-structured-merge-tree) + + diff --git a/content/influxdb/v1.3/concepts/insights_tradeoffs.md b/content/influxdb/v1.3/concepts/insights_tradeoffs.md new file mode 100644 index 000000000..d8ab3045d --- /dev/null +++ b/content/influxdb/v1.3/concepts/insights_tradeoffs.md @@ -0,0 +1,42 @@ +--- +title: Design Insights and Tradeoffs in InfluxDB +menu: + influxdb_1_3: + weight: 60 + parent: Concepts +--- + +InfluxDB is a time-series database. +Optimizing for this use-case entails some tradeoffs, primarily to increase performance at the cost of functionality. +Below is a list of some of those design insights that lead to tradeoffs: + +1. For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times. + * *Pro:* Simplified [conflict resolution](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) increases write performance + * *Con:* Cannot store duplicate data; may overwrite data in rare circumstances +1. Deletes are a rare occurrence. +When they do occur it is almost always against large ranges of old data that are cold for writes. + * *Pro:* Restricting access to deletes allows for increased query and write performance + * *Con:* Delete functionality is significantly restricted +1. Updates to existing data are a rare occurrence and contentious updates never happen. +Time series data is predominantly new data that is never updated. + * *Pro:* Restricting access to updates allows for increased query and write performance + * *Con:* Update functionality is significantly restricted +1. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order. 
+ * *Pro:* Adding data in time ascending order is significantly more performant + * *Con:* Writing points with random times or with time not in ascending order is significantly less performant +1. Scale is critical. +The database must be able to handle a *high* volume of reads and writes. + * *Pro:* The database can handle a *high* volume of reads and writes + * *Con:* The InfluxDB development team was forced to make tradeoffs to increase performance +1. Being able to write and query the data is more important than having a strongly consistent view. + * *Pro:* Writing and querying the database can be done by multiple clients and at high loads + * *Con:* Query returns may not include the most recent points if the database is under heavy load +1. Many time [series](/influxdb/v1.3/concepts/glossary/#series) are ephemeral. +There are often time series that appear only for a few hours and then go away, e.g. +a new host that gets started and reports for a while and then gets shut down. + * *Pro:* InfluxDB is good at managing discontinuous data + * *Con:* Schema-less design means that some database functions are not supported, e.g. +there are no cross-table joins +1. No one point is too important. + * *Pro:* InfluxDB has very powerful tools to deal with aggregate data and large data sets + * *Con:* Points don't have IDs in the traditional sense; they are differentiated by timestamp and series diff --git a/content/influxdb/v1.3/concepts/key_concepts.md b/content/influxdb/v1.3/concepts/key_concepts.md new file mode 100644 index 000000000..c2b04e0c5 --- /dev/null +++ b/content/influxdb/v1.3/concepts/key_concepts.md @@ -0,0 +1,197 @@ +--- +title: Key Concepts +menu: + influxdb_1_3: + weight: 1 + parent: Concepts +--- + +Before diving into InfluxDB it's good to get acquainted with some of the key concepts of the database. +This document provides a gentle introduction to those concepts and common InfluxDB terminology. 
+We've provided a list below of all the terms we'll cover, but we recommend reading this document from start to finish to gain a more general understanding of our favorite time series database. + + + + + + + + + + + + + + + + + + + + + + +
databasefield keyfield set
field valuemeasurementpoint
retention policyseriestag key
tag settag valuetimestamp
+ +Check out the [Glossary](/influxdb/v1.3/concepts/glossary/) if you prefer the cold, hard facts. + +### Sample data +The next section references the data printed out below. +The data is fictional, but represents a believable setup in InfluxDB. +They show the number of butterflies and honeybees counted by two scientists (`langstroth` and `perpetua`) in two locations (location `1` and location `2`) over the time period from August 18, 2015 at midnight through August 18, 2015 at 6:12 AM. +Assume that the data lives in a database called `my_database` and are subject to the `autogen` retention policy (more on databases and retention policies to come). + +*Hint:* Hover over the links for tooltips to get acquainted with InfluxDB terminology and the layout. + +name: census +\------------------------------------- +time                                      butterflies     honeybees     location     scientist +2015-08-18T00:00:00Z      12                   23                    1                 langstroth +2015-08-18T00:00:00Z      1                     30                    1                 perpetua +2015-08-18T00:06:00Z      11                   28                    1                 langstroth +2015-08-18T00:06:00Z   3                     28                    1                 perpetua +2015-08-18T05:54:00Z      2                     11                    2                 langstroth +2015-08-18T06:00:00Z      1                     10                    2                 langstroth +2015-08-18T06:06:00Z      8                     23                    2                 perpetua +2015-08-18T06:12:00Z      7                     22                    2                 perpetua + +### Discussion +Now that you've seen some sample data in InfluxDB this section covers what it all means. + +InfluxDB is a time series database so it makes sense to start with what is at the root of everything we do: time. 
+In the data above there's a column called `time` - all data in InfluxDB have that column. +`time` stores timestamps, and the *timestamp* shows the date and time, in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) UTC, associated with particular data. + +The next two columns, called `butterflies` and `honeybees`, are fields. +Fields are made up of field keys and field values. +*Field keys* (`butterflies` and `honeybees`) are strings and they store metadata; the field key `butterflies` tells us that the field values `12`-`7` refer to butterflies and the field key `honeybees` tells us that the field values `23`-`22` refer to, well, honeybees. + +*Field values* are your data; they can be strings, floats, integers, or booleans, and, because InfluxDB is a time series database, a field value is always associated with a timestamp. +The field values in the sample data are: + +``` +12 23 +1 30 +11 28 +3 28 +2 11 +1 10 +8 23 +7 22 +``` + +In the data above, the collection of field-key and field-value pairs make up a *field set*. +Here are all eight field sets in the sample data: + +* `butterflies = 12 honeybees = 23` +* `butterflies = 1 honeybees = 30` +* `butterflies = 11 honeybees = 28` +* `butterflies = 3 honeybees = 28` +* `butterflies = 2 honeybees = 11` +* `butterflies = 1 honeybees = 10` +* `butterflies = 8 honeybees = 23` +* `butterflies = 7 honeybees = 22` + +Fields are a required piece of InfluxDB's data structure - you cannot have data in InfluxDB without fields. +It's also important to note that fields are not indexed. +[Queries](/influxdb/v1.3/concepts/glossary/#query) that use field values as filters must scan all values that match the other conditions in the query. +As a result, those queries are not performant relative to queries on tags (more on tags below). +In general, fields should not contain commonly-queried metadata. + + +The last two columns in the sample data, called `location` and `scientist`, are tags. +Tags are made up of tag keys and tag values. 
+Both *tag keys* and *tag values* are stored as strings and record metadata. +The tag keys in the sample data are `location` and `scientist`. +The tag key `location` has two tag values: `1` and `2`. +The tag key `scientist` also has two tag values: `langstroth` and `perpetua`. + +In the data above, the *tag set* is the different combinations of all the tag key-value pairs. +The four tag sets in the sample data are: + +* `location = 1`, `scientist = langstroth` +* `location = 2`, `scientist = langstroth` +* `location = 1`, `scientist = perpetua` +* `location = 2`, `scientist = perpetua` + +Tags are optional. +You don't need to have tags in your data structure, but it's generally a good idea to make use of them because, unlike fields, tags are indexed. +This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata. + +> **Why indexing matters: The schema case study** + +> Say you notice that most of your queries focus on the values of the field keys `honeybees` and `butterflies`: + +> `SELECT * FROM "census" WHERE "butterflies" = 1` +> `SELECT * FROM "census" WHERE "honeybees" = 23` + +> Because fields aren't indexed, InfluxDB scans every value of `butterflies` in the first query and every value of `honeybees` in the second query before it provides a response. +That behavior can hurt query response times - especially on a much larger scale. 
+To optimize your queries, it may be beneficial to rearrange your [schema](/influxdb/v1.3/concepts/glossary/#schema) such that the fields (`butterflies` and `honeybees`) become the tags and the tags (`location` and `scientist`) become the fields: + +> name: census +\------------------------------------- +time                                      location     scientist      butterflies     honeybees +2015-08-18T00:00:00Z      1                 langstroth    12                   23 +2015-08-18T00:00:00Z      1                 perpetua      1                     30 +2015-08-18T00:06:00Z      1                 langstroth    11                   28 +2015-08-18T00:06:00Z   1                 perpetua      3                     28 +2015-08-18T05:54:00Z      2                 langstroth    2                     11 +2015-08-18T06:00:00Z      2                 langstroth    1                     10 +2015-08-18T06:06:00Z      2                 perpetua      8                     23 +2015-08-18T06:12:00Z      2                 perpetua      7                     22 + +> Now that `butterflies` and `honeybees` are tags, InfluxDB won't have to scan every one of their values when it performs the queries above - this means that your queries are even faster. + +The *measurement* acts as a container for tags, fields, and the `time` column, and the measurement name is the description of the data that are stored in the associated fields. +Measurement names are strings, and, for any SQL users out there, a measurement is conceptually similar to a table. +The only measurement in the sample data is `census`. +The name `census` tells us that the field values record the number of `butterflies` and `honeybees` - not their size, direction, or some sort of happiness index. + +A single measurement can belong to different retention policies. +A *retention policy* describes how long InfluxDB keeps data (`DURATION`) and how many copies of this data is stored in the cluster (`REPLICATION`). 
+If you're interested in reading more about retention policies, check out [Database Management](/influxdb/v1.3/query_language/database_management/#retention-policy-management). + +{{% warn %}} Replication factors do not serve a purpose with single node instances. +{{% /warn %}} + +In the sample data, everything in the `census` measurement belongs to the `autogen` retention policy. +InfluxDB automatically creates that retention policy; it has an infinite duration and a replication factor set to one. + +Now that you're familiar with measurements, tag sets, and retention policies it's time to discuss series. +In InfluxDB, a *series* is the collection of data that share a retention policy, measurement, and tag set. +The data above consist of four series: + +| Arbitrary series number | Retention policy | Measurement | Tag set | +|---|---|---|---| +| series 1 | `autogen` | `census` | `location = 1`,`scientist = langstroth` | +| series 2 | `autogen` | `census` | `location = 2`,`scientist = langstroth` | +| series 3 | `autogen` | `census` | `location = 1`,`scientist = perpetua` | +| series 4 | `autogen` | `census` | `location = 2`,`scientist = perpetua` | + +Understanding the concept of a series is essential when designing your [schema](/influxdb/v1.3/concepts/glossary/#schema) and when working with your data in InfluxDB. + +Finally, a *point* is the field set in the same series with the same timestamp. +For example, here's a single point: +``` +name: census +----------------- +time butterflies honeybees location scientist +2015-08-18T00:00:00Z 1 30 1 perpetua +``` + +The series in the example is defined by the retention policy (`autogen`), the measurement (`census`), and the tag set (`location = 1`, `scientist = perpetua`). +The timestamp for the point is `2015-08-18T00:00:00Z`. + +All of the stuff we've just covered is stored in a database - the sample data are in the database `my_database`. 
+An InfluxDB *database* is similar to traditional relational databases and serves as a logical container for users, retention policies, continuous queries, and, of course, your time series data. +See [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/) and [Continuous Queries](/influxdb/v1.3/query_language/continuous_queries/) for more on those topics. + +Databases can have several users, continuous queries, retention policies, and measurements. +InfluxDB is a schemaless database which means it's easy to add new measurements, tags, and fields at any time. +It's designed to make working with time series data awesome. + +You made it! +You've covered the fundamental concepts and terminology in InfluxDB. +If you're just starting out, we recommend taking a look at [Getting Started](/influxdb/v1.3/introduction/getting_started/) and the [Writing Data](/influxdb/v1.3/guides/writing_data/) and [Querying Data](/influxdb/v1.3/guides/querying_data/) guides. +May our time series database serve you well 🕔. diff --git a/content/influxdb/v1.3/concepts/schema_and_data_layout.md b/content/influxdb/v1.3/concepts/schema_and_data_layout.md new file mode 100644 index 000000000..f8e0bde01 --- /dev/null +++ b/content/influxdb/v1.3/concepts/schema_and_data_layout.md @@ -0,0 +1,173 @@ +--- +title: Schema Design +menu: + influxdb_1_3: + weight: 70 + parent: Concepts +--- + +Every InfluxDB use case is special and your [schema](/influxdb/v1.3/concepts/glossary/#schema) will reflect that uniqueness. +There are, however, general guidelines to follow and pitfalls to avoid when designing your schema. + + + + + + + + +
General RecommendationsEncouraged Schema DesignDiscouraged Schema DesignShard Group Duration Management
+ +# General Recommendations + +## Encouraged Schema Design + +In no particular order, we recommend that you: + +### *Encode meta data in tags* + +[Tags](/influxdb/v1.3/concepts/glossary/#tag) are indexed and [fields](/influxdb/v1.3/concepts/glossary/#field) are not indexed. +This means that queries on tags are more performant than those on fields. + +In general, your queries should guide what gets stored as a tag and what gets stored as a field: + +* Store data in tags if they're commonly-queried meta data +* Store data in tags if you plan to use them with `GROUP BY()` +* Store data in fields if you plan to use them with an [InfluxQL function](/influxdb/v1.3/query_language/functions/) +* Store data in fields if you *need* them to be something other than a string - [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) are always interpreted as strings + +### *Avoid using InfluxQL Keywords as identifier names* + +This isn't necessary, but it simplifies writing queries; you won't have to wrap those identifiers in double quotes. +Identifiers are database names, [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) names, [user](/influxdb/v1.3/concepts/glossary/#user) names, [measurement](/influxdb/v1.3/concepts/glossary/#measurement) names, [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key), and [field keys](/influxdb/v1.3/concepts/glossary/#field-key). +See [InfluxQL Keywords](https://github.com/influxdata/influxql/blob/master/README.md#keywords) for words to avoid. + +Note that you will also need to wrap identifiers in double quotes in queries if they contain characters other than `[A-z,_]`. + +## Discouraged Schema Design + +In no particular order, we recommend that you: + +### *Don't have too many series* + +[Tags](/influxdb/v1.3/concepts/glossary/#tag) containing highly variable information like UUIDs, hashes, and random strings will lead to a large number of series in the database, known colloquially as high series cardinality. 
+High series cardinality is a primary driver of high memory usage for many database workloads. + +See [Hardware Sizing Guidelines](/influxdb/v1.3/guides/hardware_sizing/#general-hardware-guidelines-for-a-single-node) for [series cardinality](/influxdb/v1.3/concepts/glossary/#series-cardinality) recommendations based on your hardware. If the system has memory constraints, consider storing high-cardinality data as a field rather than a tag. + +### *Don't encode data in measurement names* + +In general, taking this step will simplify your queries. +InfluxDB queries merge data that fall within the same [measurement](/influxdb/v1.3/concepts/glossary/#measurement); it's better to differentiate data with [tags](/influxdb/v1.3/concepts/glossary/#tag) than with detailed measurement names. + +_Example:_ + +Consider the following schema represented by line protocol. + +``` +Schema 1 - Data encoded in the measurement name +------------- +blueberries.plot-1.north temp=50.1 1472515200000000000 +blueberries.plot-2.midwest temp=49.8 1472515200000000000 +``` + +The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics. +Encoding information like `plot` and `region` in the measurement name will make the data much harder to query. + +For instance, calculating the average temperature of both plots 1 and 2 would not be possible with Schema 1. +Compare this to the following schema represented in line protocol. + +``` +Schema 2 - Data encoded in tags +------------- +weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000 +weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000 +``` + +The following queries calculate the average of `temp` for blueberries that fall in the `north` region. +While both queries are relatively simple, use of the regular expression makes certain queries much more complicated or impossible. 
+ +``` +# Schema 1 - Query for data encoded in the measurement name +> SELECT mean("temp") FROM /\.north$/ + +# Schema 2 - Query for data encoded in tags +> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north' +``` + +### *Don't put more than one piece of information in one tag* + +Similar to the point above, splitting a single tag with multiple pieces into separate tags will simplify your queries and reduce the need for regular expressions. + +_Example:_ + +Consider the following schema represented by line protocol. + +``` +Schema 1 - Multiple data encoded in a single tag +------------- +weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000 +weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000 +``` + +The above data encodes multiple separate parameters, `plot` and `region`, into a long tag value (`plot-1.north`). +Compare this to the following schema represented in line protocol. + +``` +Schema 2 - Data encoded in multiple tags +------------- +weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000 +weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000 +``` + +The following queries calculate the average of `temp` for blueberries that fall in the `north` region. +While both queries are similar, the use of multiple tags in Schema 2 avoids the use of regular expressions. + +``` +# Schema 1 - Query for multiple data encoded in a single tag +> SELECT mean("temp") FROM "weather_sensor" WHERE location =~ /\.north$/ + +# Schema 2 - Query for data encoded in multiple tags +> SELECT mean("temp") FROM "weather_sensor" WHERE region = 'north' +``` + +# Shard Group Duration Management + +## Shard Group Duration Overview + +InfluxDB stores data in shard groups. +Shard groups are organized by [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) (RP) and store data with timestamps that fall within a specific time interval. 
+The length of that time interval is called the [shard group duration](/influxdb/v1.3/concepts/glossary/#shard-duration).
+
+By default, the shard group duration is determined by the RP's [duration](/influxdb/v1.3/concepts/glossary/#duration):
+
+| RP Duration | Shard Group Duration |
+|---|---|
+| < 2 days | 1 hour |
+| >= 2 days and <= 6 months | 1 day |
+| > 6 months | 7 days |
+
+The shard group duration is also configurable per RP.
+See [Retention Policy Management](/influxdb/v1.3/query_language/database_management/#retention-policy-management) for how to configure the
+shard group duration.
+
+## Shard Group Duration Recommendations
+
+In general, shorter shard group durations allow the system to efficiently drop data.
+When InfluxDB enforces an RP, it drops entire shard groups, not individual data points.
+For example, if your RP has a duration of one day, it makes sense to have a shard group duration of one hour; InfluxDB will drop an hour's worth of data every hour.
+
+If your RP's duration is greater than six months, there's no need to have a short shard group duration.
+In fact, increasing the shard group duration beyond the default seven-day value can improve compression, improve write speed, and decrease the fixed iterator overhead per shard group.
+Shard group durations of 50 years and over, for example, are acceptable configurations.
+
+> **Note:** `INF` (infinite) is not a valid duration [when configuring](/influxdb/v1.3/query_language/database_management/#retention-policy-management)
+the shard group duration.
+As a workaround, specify a `1000w` duration to achieve an extremely long shard group
+duration.
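The defaults in the table above can be sketched as a small lookup function. This is purely an illustration of the mapping (approximating six months as 180 days), not InfluxDB's actual implementation:

```python
from datetime import timedelta

def default_shard_group_duration(rp_duration: timedelta) -> timedelta:
    """Map a retention policy duration to the default shard group
    duration, following the table above (six months ~= 180 days)."""
    if rp_duration < timedelta(days=2):
        return timedelta(hours=1)
    if rp_duration <= timedelta(days=180):
        return timedelta(days=1)
    return timedelta(days=7)
```

For example, an RP duration of one day maps to one-hour shard groups, which matches the drop-an-hour-at-a-time behavior described above.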
+
+We recommend configuring the shard group duration such that:
+
+* it is two times your longest typical query's time range
+* each shard group contains at least 100,000 [points](/influxdb/v1.3/concepts/glossary/#point)
+* each shard group has at least 1,000 points per [series](/influxdb/v1.3/concepts/glossary/#series)
diff --git a/content/influxdb/v1.3/concepts/storage_engine.md b/content/influxdb/v1.3/concepts/storage_engine.md
new file mode 100644
index 000000000..7e2c57dd9
--- /dev/null
+++ b/content/influxdb/v1.3/concepts/storage_engine.md
@@ -0,0 +1,435 @@
+---
+title: Storage Engine
+
+menu:
+  influxdb_1_3:
+    weight: 90
+    parent: Concepts
+---
+
+# The InfluxDB Storage Engine and the Time-Structured Merge Tree (TSM)
+
+The new InfluxDB storage engine looks very similar to an LSM Tree.
+It has a write ahead log and a collection of read-only data files which are similar in concept to SSTables in an LSM Tree.
+TSM files contain sorted, compressed series data.
+
+InfluxDB will create a [shard](/influxdb/v1.3/concepts/glossary/#shard) for each block of time.
+For example, if you have a [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) with an unlimited duration, shards will be created for each 7-day block of time.
+Each of these shards maps to an underlying storage engine database.
+Each of these databases has its own [WAL](/influxdb/v1.3/concepts/glossary/#wal-write-ahead-log) and TSM files.
+
+We'll dig into each of these parts of the storage engine.
+
+## Storage Engine
+
+The storage engine ties a number of components together and provides the external interface for storing and querying series data.
It is composed of a number of components that each serve a particular role:
+
+* In-Memory Index - The in-memory index is a shared index across shards that provides quick access to [measurements](/influxdb/v1.3/concepts/glossary/#measurement), [tags](/influxdb/v1.3/concepts/glossary/#tag), and [series](/influxdb/v1.3/concepts/glossary/#series). The index is used by the engine, but is not specific to the storage engine itself.
+* WAL - The WAL is a write-optimized storage format that allows for writes to be durable, but not easily queryable. Writes to the WAL are appended to segments of a fixed size.
+* Cache - The Cache is an in-memory representation of the data stored in the WAL. It is queried at runtime and merged with the data stored in TSM files.
+* TSM Files - TSM files store compressed series data in a columnar format.
+* FileStore - The FileStore mediates access to all TSM files on disk. It ensures that TSM files are installed atomically when existing ones are replaced, and removes TSM files that are no longer used.
+* Compactor - The Compactor is responsible for converting less optimized Cache and TSM data into more read-optimized formats. It does this by compressing series, removing deleted data, optimizing indices, and combining smaller files into larger ones.
+* Compaction Planner - The Compaction Planner determines which TSM files are ready for a compaction and ensures that multiple concurrent compactions do not interfere with each other.
+* Compression - Compression is handled by various Encoders and Decoders for specific data types. Some encoders are fairly static and always encode the same type the same way; others switch their compression strategy based on the shape of the data.
+* Writers/Readers - Each file type (WAL segments, TSM files, tombstones, etc.) has Writers and Readers for working with the formats.
+
+### Write Ahead Log (WAL)
+
+The WAL is organized as a bunch of files that look like `_000001.wal`.
+The file numbers are monotonically increasing and referred to as WAL segments.
+When a segment reaches 10MB in size, it is closed and a new one is opened. Each WAL segment stores multiple compressed blocks of writes and deletes.
+
+When a write comes in, the new points are serialized, compressed using Snappy, and written to a WAL file.
+The file is `fsync`'d and the data is added to an in-memory index before a success is returned.
+This means that batching points together is required to achieve high throughput performance.
+(Optimal batch size seems to be 5,000-10,000 points per batch for many use cases.)
+
+Each entry in the WAL follows a [TLV standard](https://en.wikipedia.org/wiki/Type-length-value) with a single byte representing the type of entry (write or delete), a 4-byte `uint32` for the length of the compressed block, and then the compressed block.
+
+### Cache
+
+The Cache is an in-memory copy of all data points currently stored in the WAL.
+The points are organized by the key, which is the measurement, [tag set](/influxdb/v1.3/concepts/glossary/#tag-set), and unique [field](/influxdb/v1.3/concepts/glossary/#field).
+Each field is kept as its own time-ordered range.
+The Cache data is not compressed while in memory.
+
+Queries to the storage engine will merge data from the Cache with data from the TSM files.
+Queries execute on a copy of the data that is made from the cache at query processing time.
+This way, writes that come in while a query is running won't affect the result.
+
+Deletes sent to the Cache will clear out the given key or the specific time range for the given key.
+
+The Cache exposes a few controls for snapshotting behavior.
+The two most important controls are the memory limits.
+There is a lower bound, [`cache-snapshot-memory-size`](/influxdb/v1.3/administration/config/#cache-snapshot-memory-size-26214400), which when exceeded will trigger a snapshot to TSM files and remove the corresponding WAL segments.
+There is also an upper bound, [`cache-max-memory-size`](/influxdb/v1.3/administration/config/#cache-max-memory-size-1073741824), which when exceeded will cause the Cache to reject new writes. +These configurations are useful to prevent out of memory situations and to apply back pressure to clients writing data faster than the instance can persist it. +The checks for memory thresholds occur on every write. + +The other snapshot controls are time based. +The idle threshold, [`cache-snapshot-write-cold-duration`](/influxdb/v1.3/administration/config/#cache-snapshot-write-cold-duration-10m), forces the Cache to snapshot to TSM files if it hasn't received a write within the specified interval. + +The in-memory Cache is recreated on restart by re-reading the WAL files on disk. + +### TSM Files + +TSM files are a collection of read-only files that are memory mapped. +The structure of these files looks very similar to an SSTable in LevelDB or other LSM Tree variants. + +A TSM file is composed of four sections: header, blocks, index, and footer. + +``` +┌────────┬────────────────────────────────────┬─────────────┬──────────────┐ +│ Header │ Blocks │ Index │ Footer │ +│5 bytes │ N bytes │ N bytes │ 4 bytes │ +└────────┴────────────────────────────────────┴─────────────┴──────────────┘ +``` + +The Header is a magic number to identify the file type and a version number. + +``` +┌───────────────────┐ +│ Header │ +├─────────┬─────────┤ +│ Magic │ Version │ +│ 4 bytes │ 1 byte │ +└─────────┴─────────┘ +``` + +Blocks are sequences of pairs of CRC32 checksums and data. +The block data is opaque to the file. +The CRC32 is used for block level error detection. +The length of the blocks is stored in the index. 
+ +``` +┌───────────────────────────────────────────────────────────┐ +│ Blocks │ +├───────────────────┬───────────────────┬───────────────────┤ +│ Block 1 │ Block 2 │ Block N │ +├─────────┬─────────┼─────────┬─────────┼─────────┬─────────┤ +│ CRC │ Data │ CRC │ Data │ CRC │ Data │ +│ 4 bytes │ N bytes │ 4 bytes │ N bytes │ 4 bytes │ N bytes │ +└─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘ +``` + +Following the blocks is the index for the blocks in the file. +The index is composed of a sequence of index entries ordered lexicographically by key and then by time. +The key includes the measurement name, tag set, and one field. +Multiple fields per point creates multiple index entries in the TSM file. +Each index entry starts with a key length and the key, followed by the block type (float, int, bool, string) and a count of the number of index block entries that follow for that key. +Each index block entry is composed of the min and max time for the block, the offset into the file where the block is located and the size of the block. There is one index block entry for each block in the TSM file that contains the key. + +The index structure can provide efficient access to all blocks as well as the ability to determine the cost associated with accessing a given key. +Given a key and timestamp, we can determine whether a file contains the block for that timestamp. +We can also determine where that block resides and how much data must be read to retrieve the block. +Knowing the size of the block, we can efficiently provision our IO statements. 
+ +``` +┌────────────────────────────────────────────────────────────────────────────┐ +│ Index │ +├─────────┬─────────┬──────┬───────┬─────────┬─────────┬────────┬────────┬───┤ +│ Key Len │ Key │ Type │ Count │Min Time │Max Time │ Offset │ Size │...│ +│ 2 bytes │ N bytes │1 byte│2 bytes│ 8 bytes │ 8 bytes │8 bytes │4 bytes │ │ +└─────────┴─────────┴──────┴───────┴─────────┴─────────┴────────┴────────┴───┘ +``` + +The last section is the footer that stores the offset of the start of the index. + +``` +┌─────────┐ +│ Footer │ +├─────────┤ +│Index Ofs│ +│ 8 bytes │ +└─────────┘ +``` + +### Compression + +Each block is compressed to reduce storage space and disk IO when querying. +A block contains the timestamps and values for a given series and field. +Each block has one byte header, followed by the compressed timestamps and then the compressed values. + +``` +┌───────┬─────┬─────────────────┬──────────────────┐ +│ Type │ Len │ Timestamps │ Values │ +│1 Byte │VByte│ N Bytes │ N Bytes │ +└───────┴─────┴─────────────────┴──────────────────┘ +``` + +The timestamps and values are compressed and stored separately using encodings dependent on the data type and its shape. +Storing them independently allows timestamp encoding to be used for all timestamps, while allowing different encodings for different field types. +For example, some points may be able to use run-length encoding whereas other may not. + +Each value type also contains a 1 byte header indicating the type of compression for the remaining bytes. +The four high bits store the compression type and the four low bits are used by the encoder if needed. + +#### Timestamps + +Timestamp encoding is adaptive and based on the structure of the timestamps that are encoded. +It uses a combination of delta encoding, scaling, and compression using simple8b run-length encoding, as well as falling back to no compression if needed. 
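As a rough illustration of the first two steps (delta encoding, then scaling), consider the following Python sketch. The function name and the cap on the scale are assumptions for illustration; the real encoder is part of InfluxDB's Go implementation and additionally chooses between run-length encoding, simple8b, and raw 8-byte values as described below.

```python
def delta_and_scale(timestamps):
    """Delta-encode sorted integer timestamps, then divide the deltas by
    the largest power of 10 that divides all of them (the "scaling" step).
    Returns (first_timestamp, divisor, scaled_deltas)."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    divisor = 1
    for exp in range(9, 0, -1):  # try 10**9 down to 10
        scale = 10 ** exp
        if deltas and all(d % scale == 0 for d in deltas):
            divisor = scale
            break
    scaled = [d // divisor for d in deltas]
    return timestamps[0], divisor, scaled

# Points arriving every 10s at nanosecond resolution have deltas of
# 10,000,000,000, which scale down to 10 each -- and since the scaled
# deltas are all equal, the block could then be run-length encoded.
```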
+
+Timestamp resolution is variable but can be as granular as a nanosecond, requiring up to 8 bytes to store uncompressed.
+During encoding, the values are first delta-encoded.
+The first value is the starting timestamp and subsequent values are the differences from the prior value.
+This usually converts the values into much smaller integers that are easier to compress.
+Many timestamps are also monotonically increasing and fall on even boundaries of time such as every 10s.
+When timestamps have this structure, they are scaled by the largest common divisor that is also a factor of 10.
+This has the effect of converting very large integer deltas into smaller ones that compress even better.
+
+Using these adjusted values, if all the deltas are the same, the time range is stored using run-length encoding.
+If run-length encoding is not possible and all values are less than (1 << 60) - 1 ([~36.5 years](https://www.wolframalpha.com/input/?i=\(1+%3C%3C+60\)+-+1+nanoseconds+to+years) at nanosecond resolution), then the timestamps are encoded using [simple8b encoding](https://github.com/jwilder/encoding/tree/master/simple8b).
+Simple8b encoding is a 64-bit word-aligned integer encoding that packs multiple integers into a single 64-bit word.
+If any value exceeds the maximum, the deltas are stored uncompressed using 8 bytes each for the block.
+Future encodings may use a patched scheme such as Patched Frame-Of-Reference (PFOR) to handle outliers more effectively.
+
+#### Floats
+
+Floats are encoded using an implementation of the [Facebook Gorilla paper](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf).
+The encoding XORs consecutive values together to produce a small result when the values are close together.
+The delta is then stored using control bits to indicate how many leading and trailing zeroes are in the XOR value.
+Our implementation removes the timestamp encoding described in the paper and only encodes the float values.
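The XOR step can be sketched in a few lines of Python. The function names here are ours; the real encoder is Go code that also emits the Gorilla control bits and bit-packs the results.

```python
import struct

def float_bits(f: float) -> int:
    """Return the IEEE-754 bit pattern of a float64 as an integer."""
    return struct.unpack('>Q', struct.pack('>d', f))[0]

def xor_delta(prev: float, curr: float) -> int:
    """XOR consecutive values; close values share sign, exponent, and
    high mantissa bits, leaving long runs of zero bits."""
    return float_bits(prev) ^ float_bits(curr)

def leading_trailing_zeros(x: int):
    """Count leading and trailing zero bits in a 64-bit value."""
    if x == 0:
        return 64, 0
    bits = f'{x:064b}'
    return (len(bits) - len(bits.lstrip('0')),
            len(bits) - len(bits.rstrip('0')))
```

Identical consecutive values XOR to zero, which the encoder can represent very compactly; nearby values produce an XOR with many leading and trailing zeroes that only the middle bits need to be stored for.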
+
+#### Integers
+
+Integer encoding uses two different strategies depending on the range of values in the uncompressed data.
+Values are first encoded using [ZigZag encoding](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers).
+This interleaves positive and negative integers across a range of positive integers.
+
+For example, [-2,-1,0,1] becomes [3,1,0,2].
+See Google's [Protocol Buffers documentation](https://developers.google.com/protocol-buffers/docs/encoding#signed-integers) for more information.
+
+If all ZigZag encoded values are less than (1 << 60) - 1, they are compressed using simple8b encoding.
+If any values are larger than the maximum, then all values are stored uncompressed in the block.
+If all values are identical, run-length encoding is used.
+This works very well for values that are frequently constant.
+
+#### Booleans
+
+Booleans are encoded using a simple bit packing strategy where each boolean uses 1 bit.
+The number of booleans encoded is stored using variable-byte encoding at the beginning of the block.
+
+#### Strings
+Strings are encoded using [Snappy](http://google.github.io/snappy/) compression.
+Each string is packed consecutively and they are compressed as one larger block.
+
+### Compactions
+
+Compactions are recurring processes that migrate data stored in a write-optimized format into a more read-optimized format.
+There are a number of stages of compaction that take place while a shard is hot for writes:
+
+* Snapshots - Values in the Cache and WAL must be converted to TSM files to free memory and disk space used by the WAL segments.
+These compactions occur based on the cache memory and time thresholds.
+* Level Compactions - Level compactions (levels 1-4) occur as the TSM files grow.
+TSM files are compacted from snapshots to level 1 files.
+Multiple level 1 files are compacted to produce level 2 files.
+The process continues until files reach level 4 and the max size for a TSM file.
+They will not be compacted further unless deletes, index optimization compactions, or full compactions need to run.
+Lower level compactions use strategies that avoid CPU-intensive activities like decompressing and combining blocks.
+Higher level (and thus less frequent) compactions will re-combine blocks to fully compact them and increase the compression ratio.
+* Index Optimization - When many level 4 TSM files accumulate, the internal indexes become larger and more costly to access.
+An index optimization compaction splits the series and indices across a new set of TSM files, sorting all points for a given series into one TSM file.
+Before an index optimization, each TSM file contained points for most or all series, and thus each contained the same series index.
+After an index optimization, each TSM file contains points from a minimal number of series and there is little series overlap between files.
+Each TSM file thus has a smaller unique series index, instead of a duplicate of the full series list.
+In addition, all points from a particular series are contiguous in a TSM file rather than spread across multiple TSM files.
+* Full Compactions - Full compactions run when a shard has become cold for writes for a long time, or when deletes have occurred on the shard.
+Full compactions produce an optimal set of TSM files and include all optimizations from Level and Index Optimization compactions.
+Once a shard is fully compacted, no other compactions will run on it unless new writes or deletes are stored.
+
+### Writes
+
+Writes are appended to the current WAL segment and are also added to the Cache.
+Each WAL segment has a maximum size.
+Writes roll over to a new file once the current file fills up.
+The cache is also size bounded; snapshots are taken and WAL compactions are initiated when the cache becomes too full.
+If the inbound write rate exceeds the WAL compaction rate for a sustained period, the cache may become too full, in which case new writes will fail until the snapshot process catches up.
+
+When WAL segments fill up and are closed, the Compactor snapshots the Cache and writes the data to a new TSM file.
+When the TSM file is successfully written and `fsync`'d, it is loaded and referenced by the FileStore.
+
+### Updates
+
+Updates (writing a newer value for a point that already exists) occur as normal writes.
+Since cached values overwrite existing values, newer writes take precedence.
+If a write would overwrite a point in a prior TSM file, the points are merged at query runtime and the newer write takes precedence.
+
+
+### Deletes
+
+Deletes occur by writing a delete entry to the WAL for the measurement or series and then updating the Cache and FileStore.
+The Cache evicts all relevant entries.
+The FileStore writes a tombstone file for each TSM file that contains relevant data.
+These tombstone files are used at startup time to ignore blocks as well as during compactions to remove deleted entries.
+
+Queries against partially deleted series are handled at query time until a compaction removes the data fully from the TSM files.
+
+### Queries
+
+When a query is executed by the storage engine, it is essentially a seek to a given time associated with a specific series key and field.
+First, we do a search on the data files to find the files that contain a time range matching the query as well as containing matching series.
+
+Once we have the data files selected, we next need to find the position in the file of the series key index entries.
+We run a binary search against each TSM index to find the location of its index blocks.
+
+In common cases, the blocks will not overlap across multiple TSM files and we can search the index entries linearly to find the start block from which to read.
+If there are overlapping blocks of time, the index entries are sorted to ensure newer writes will take precedence and that blocks can be processed in order during query execution.
+
+When iterating over the index entries, the blocks are read sequentially from the blocks section.
+The block is decompressed and we seek to the specific point.
+
+
+# The new InfluxDB storage engine: from LSM Tree to B+Tree and back again to create the Time Structured Merge Tree
+
+Writing a new storage format should be a last resort.
+So how did InfluxData end up writing its own engine?
+InfluxData has experimented with many storage formats and found each lacking in some fundamental way.
+The performance requirements for InfluxDB are significant, and eventually overwhelm other storage systems.
+The 0.8 line of InfluxDB allowed multiple storage engines, including LevelDB, RocksDB, HyperLevelDB, and LMDB.
+The 0.9 line of InfluxDB used BoltDB as the underlying storage engine.
+This writeup is about the Time Structured Merge Tree storage engine that was released in 0.9.5 and is the only storage engine supported in InfluxDB 0.11+, including the entire 1.x family.
+
+The properties of the time series data use case make it challenging for many existing storage engines.
+Over the course of InfluxDB's development, we've tried a few of the more popular options.
+We started with LevelDB, an engine based on LSM Trees, which are optimized for write throughput.
+After that we tried BoltDB, an engine based on a memory-mapped B+Tree, which is optimized for reads.
+Finally, we ended up building our own storage engine that is similar in many ways to LSM Trees.
+
+With our new storage engine, we were able to achieve up to a 45x reduction in disk space usage from our B+Tree setup with even greater write throughput and compression than what we saw with LevelDB and its variants.
+This post will cover the details of that evolution and end with an in-depth look at our new storage engine and its inner workings.
+
+## Properties of Time Series Data
+
+The workload of time series data is quite different from normal database workloads.
+There are a number of factors that conspire to make it very difficult to scale and remain performant:
+
+* Billions of individual data points
+* High write throughput
+* High read throughput
+* Large deletes (data expiration)
+* Mostly an insert/append workload, very few updates
+
+The first and most obvious problem is one of scale.
+In DevOps, IoT, or APM it is easy to collect hundreds of millions or billions of unique data points every day.
+
+For example, let's say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds.
+Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day, per server.
+That gives us a total of 200 * 100 * 8,640 = 172,800,000 individual data points per day.
+We find similar or larger numbers in sensor data use cases.
+
+The volume of data means that the write throughput can be very high.
+We regularly get requests for setups that can handle hundreds of thousands of writes per second.
+Some larger companies will only consider systems that can handle millions of writes per second.
+
+At the same time, time series data can be a high read throughput use case.
+It's true that if you're tracking 700,000 unique metrics or time series you can't hope to visualize all of them.
+That leads many people to think that you don't actually read most of the data that goes into the database.
+However, other than dashboards that people have up on their screens, there are automated systems for monitoring or combining the large volume of time series data with other types of data.
+
+Inside InfluxDB, aggregate functions calculated on the fly may combine tens of thousands of distinct time series into a single view.
+Each one of those queries must read each aggregated data point, so for InfluxDB the read throughput is often many times higher than the write throughput.
+
+Given that time series is mostly an append-only workload, you might think that it's possible to get great performance on a B+Tree.
+Appends in the keyspace are efficient, and you can achieve more than 100,000 per second.
+However, we have those appends happening in individual time series.
+So the inserts end up looking more like random inserts than append-only inserts.
+
+One of the biggest problems we found with time series data is that it's very common to delete all data after it gets past a certain age.
+The common pattern here is that users have high precision data that is kept for a short period of time like a few days or months.
+Users then downsample and aggregate that data into lower precision rollups that are kept around much longer.
+
+The naive implementation would be to simply delete each record once it passes its expiration time.
+However, that means that once the first points written reach their expiration date, the system is processing just as many deletes as writes, which is something most storage engines aren't designed for.
+
+Let's dig into the details of the two types of storage engines we tried and how these properties had a significant impact on our performance.
+
+## LevelDB and Log Structured Merge Trees
+
+When the InfluxDB project began, we picked LevelDB as the storage engine because we had used it for time series data storage in the product that was the precursor to InfluxDB.
+We knew that it had great properties for write throughput and everything seemed to "just work".
+
+LevelDB is an implementation of a Log Structured Merge Tree (or LSM Tree) that was built as an open source project at Google.
+It exposes an API for a key/value store where the key space is sorted.
+This last part is important for time series data as it allowed us to quickly scan ranges of time as long as the timestamp was in the key.
+
+LSM Trees are based on a log that takes writes and two structures known as Mem Tables and SSTables.
+These tables represent the sorted keyspace.
+SSTables are read-only files that are continuously replaced by other SSTables that merge inserts and updates into the keyspace.
+
+The two biggest advantages that LevelDB had for us were high write throughput and built-in compression.
+However, as we learned more about what people needed with time series data, we encountered a few insurmountable challenges.
+
+The first problem we had was that LevelDB doesn't support hot backups.
+If you want to do a safe backup of the database, you have to close it and then copy it.
+The LevelDB variants RocksDB and HyperLevelDB fix this problem, but there was another more pressing problem that we didn't think they could solve.
+
+Our users needed a way to automatically manage data retention.
+That meant we needed deletes on a very large scale.
+In LSM Trees, a delete is as expensive, if not more so, than a write.
+A delete writes a new record known as a tombstone.
+After that, queries merge the result set with any tombstones to purge the deleted data from the query return.
+Later, a compaction runs that removes the tombstone record and the underlying deleted record in the SSTable file.
+
+To get around doing deletes, we split data across what we call shards, which are contiguous blocks of time.
+Shards would typically hold either one day's or seven days' worth of data.
+Each shard mapped to an underlying LevelDB.
+This meant that we could drop an entire day of data by just closing out the database and removing the underlying files.
+
+Users of RocksDB may at this point bring up a feature called ColumnFamilies.
+When putting time series data into Rocks, it's common to split blocks of time into column families and then drop those when their time is up.
+It's the same general idea: create a separate area where you can just drop files instead of updating indexes when you delete a large block of data. +Dropping a column family is a very efficient operation. +However, column families are a fairly new feature and we had another use case for shards. + +Organizing data into shards meant that it could be moved within a cluster without having to examine billions of keys. +At the time of this writing, it was not possible to move a column family in one RocksDB to another. +Old shards are typically cold for writes so moving them around would be cheap and easy. +We would have the added benefit of having a spot in the keyspace that is cold for writes so it would be easier to do consistency checks later. + +The organization of data into shards worked great for a while, until a large amount of data went into InfluxDB. +LevelDB splits the data out over many small files. +Having dozens or hundreds of these databases open in a single process ended up creating a big problem. +Users that had six months or a year of data would run out of file handles. +It's not something we found with the majority of users, but anyone pushing the database to its limits would hit this problem and we had no fix for it. +There were simply too many file handles open. + +## BoltDB and mmap B+Trees + +After struggling with LevelDB and its variants for a year we decided to move over to BoltDB, a pure Golang database heavily inspired by LMDB, a mmap B+Tree database written in C. +It has the same API semantics as LevelDB: a key value store where the keyspace is ordered. +Many of our users were surprised. +Our own posted tests of the LevelDB variants vs. LMDB (a mmap B+Tree) showed RocksDB as the best performer. + +However, there were other considerations that went into this decision outside of the pure write performance. +At this point our most important goal was to get to something stable that could be run in production and backed up. 
+BoltDB also had the advantage of being written in pure Go, which simplified our build chain immensely and made it easy to build for other OSes and platforms.
+
+The biggest win for us was that BoltDB used a single file as the database.
+At this point, our most common source of bug reports was people running out of file handles.
+Bolt solved the hot backup problem and the file limit problems all at the same time.
+
+We were willing to take a hit on write throughput if it meant that we'd have a more reliable and stable system that we could build on.
+Our reasoning was that for anyone pushing really big write loads, they'd be running a cluster anyway.
+
+We released versions 0.9.0 to 0.9.2 based on BoltDB.
+From a development perspective it was delightful.
+Clean API, fast and easy to build in our Go project, and reliable.
+However, after running for a while we found a big problem with write throughput.
+After the database got over a few GB, writes would start spiking IOPS.
+
+Some users were able to get past this by putting InfluxDB on big hardware with near unlimited IOPS.
+However, most users are on VMs with limited resources in the cloud.
+We had to figure out a way to reduce the impact of writing a bunch of points into hundreds of thousands of series at a time.
+
+With the 0.9.3 and 0.9.4 releases, our plan was to put a write ahead log (WAL) in front of Bolt.
+That way we could reduce the number of random insertions into the keyspace.
+Instead, we'd buffer up multiple writes that were next to each other and then flush them at once.
+However, that only served to delay the problem.
+High IOPS still became an issue and it showed up very quickly for anyone operating at even moderate workloads.
+
+However, our experience building the first WAL implementation in front of Bolt gave us the confidence we needed that the write problem could be solved.
+The performance of the WAL itself was fantastic, but the index simply could not keep up.
+At this point we started thinking again about how we could create something similar to an LSM Tree that could keep up with our write load.
+
+Thus was born the Time Structured Merge Tree.
diff --git a/content/influxdb/v1.3/data_sources/carbon.md b/content/influxdb/v1.3/data_sources/carbon.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/content/influxdb/v1.3/data_sources/collectd.md b/content/influxdb/v1.3/data_sources/collectd.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/content/influxdb/v1.3/data_sources/diamond.md b/content/influxdb/v1.3/data_sources/diamond.md
new file mode 100644
index 000000000..2b7e6cdff
--- /dev/null
+++ b/content/influxdb/v1.3/data_sources/diamond.md
@@ -0,0 +1,22 @@
+---
+title: Diamond
+---
+
+## Saving Diamond Metrics into InfluxDB
+
+Diamond is a metrics collection and delivery daemon written in Python.
+It is capable of collecting CPU, memory, network, I/O, load, and disk metrics.
+Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.
+
+[Diamond homepage](https://github.com/python-diamond)
+
+Diamond started supporting InfluxDB at version 3.5.
+
+## Configuring Diamond to send metrics to InfluxDB
+
+Prerequisites: Diamond depends on the [influxdb python client](https://github.com/influxdb/influxdb-python).
+InfluxDB-version-specific installation instructions for the influxdb python client can be found on their [GitHub page](https://github.com/influxdb/influxdb-python).
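By way of illustration, a handler section in `diamond.conf` might look roughly like the following. The option names shown here are assumptions for the sketch; consult the InfluxdbHandler configuration page linked below for the authoritative list of settings.

```
# Illustrative only -- verify option names against the handler docs.
[[InfluxdbHandler]]
hostname = localhost
port = 8086
database = diamond
username = root
password = root
batch_size = 100
```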
+ + +[Diamond InfluxdbHandler configuration page](https://github.com/python-diamond/Diamond/wiki/handler-InfluxdbHandler) + diff --git a/content/influxdb/v1.3/data_sources/opentsdb.md b/content/influxdb/v1.3/data_sources/opentsdb.md new file mode 100644 index 000000000..cff0d93bb --- /dev/null +++ b/content/influxdb/v1.3/data_sources/opentsdb.md @@ -0,0 +1,19 @@ +--- +title: OpenTSDB +--- + +InfluxDB supports the OpenTSDB ["telnet" protocol](http://opentsdb.net/docs/build/html/user_guide/writing.html#telnet). +When OpenTSDB support is enabled, InfluxDB can act as a drop-in replacement for your OpenTSDB system. + +An example input point, and how it is processed, is shown below. + +``` +put sys.cpu.user 1356998400 42.5 host=webserver01 cpu=0 +``` + +When InfluxDB receives this data, a point is written to the database. +The point's Measurement is `sys.cpu.user`, the timestamp is `1356998400`, and the value is `42.5`. +The point is also tagged with `host=webserver01` and `cpu=0`. +Tags allow fast and efficient queries to be performed on your data. + +To learn more about enabling OpenTSDB support, check the example [configuration file](https://github.com/influxdb/influxdb/blob/1.3/etc/config.sample.toml). diff --git a/content/influxdb/v1.3/external_resources.md b/content/influxdb/v1.3/external_resources.md new file mode 100644 index 000000000..1270ce0a7 --- /dev/null +++ b/content/influxdb/v1.3/external_resources.md @@ -0,0 +1,35 @@ +--- +title: External resources +menu: + influxdb_1_3: + weight: 200 +--- + +But wait, there's more! +Check out these resources to learn more about InfluxDB. + +## [InfluxData Blog](https://www.influxdata.com/blog/) + +Check out the InfluxData Blog for announcements, updates, and +weekly [tech tips](https://www.influxdata.com/category/tech-tips/). 
+
+## [Technical Papers](https://www.influxdata.com/_resources/techpapers-new/)
+
+InfluxData's Technical Papers series offers in-depth analysis of performance, time series,
+and benchmarking InfluxDB vs. other popular databases.
+
+## [Meetup Videos](https://www.influxdata.com/_resources/videosnew//)
+
+Check out our growing collection of Meetup videos for introductory content, how-tos, and more.
+
+## [Virtual Training Videos](https://www.influxdata.com/_resources/videosnew/)
+
+Watch the videos from our weekly training webinar.
+
+## [Virtual Training Schedule](https://www.influxdata.com/virtual-training-courses/)
+
+Check out our virtual training schedule to register for future webinars.
+
+## [Events](https://www.influxdata.com/events/)
+
+Find out what's happening at InfluxData and sign up for upcoming events.
diff --git a/content/influxdb/v1.3/guides/_index.md b/content/influxdb/v1.3/guides/_index.md
new file mode 100644
index 000000000..36517bf62
--- /dev/null
+++ b/content/influxdb/v1.3/guides/_index.md
@@ -0,0 +1,17 @@
+---
+title: InfluxDB guides
+menu:
+  influxdb_1_3:
+    name: Guides
+    weight: 40
+---
+
+## [Writing Data](/influxdb/v1.3/guides/writing_data/)
+
+## [Querying Data](/influxdb/v1.3/guides/querying_data/)
+
+## [Downsampling and Data Retention](/influxdb/v1.3/guides/downsampling_and_retention/)
+
+## [Hardware Sizing Guidelines](/influxdb/v1.3/guides/hardware_sizing/)
+
+## [HTTPS Setup](/influxdb/v1.3/administration/https_setup/)
diff --git a/content/influxdb/v1.3/guides/downsampling_and_retention.md b/content/influxdb/v1.3/guides/downsampling_and_retention.md
new file mode 100644
index 000000000..34dff8c36
--- /dev/null
+++ b/content/influxdb/v1.3/guides/downsampling_and_retention.md
@@ -0,0 +1,228 @@
+---
+title: Downsampling and Data Retention
+menu:
+  influxdb_1_3:
+    weight: 11
+    parent: Guides
+---
+
+InfluxDB can handle hundreds of thousands of data points per second.
+Working with that much data over a long period of time can create storage
+concerns.
+A natural solution is to downsample the data: keep the high-precision raw data
+for only a limited time, and store the lower-precision, summarized data for much
+longer or forever.
+
+InfluxDB offers two features - Continuous Queries (CQ) and Retention Policies
+(RP) - that automate the process of downsampling data and expiring old data.
+This guide describes a practical use case for CQs and RPs and covers how to
+set up those features in InfluxDB.
+
+### Definitions
+
+A **Continuous Query** (CQ) is an InfluxQL query that runs automatically and
+periodically within a database.
+CQs require a function in the `SELECT` clause and must include a
+`GROUP BY time()` clause.
+
+A **Retention Policy** (RP) is the part of InfluxDB's data structure
+that describes how long InfluxDB keeps data.
+InfluxDB compares your local server's timestamp to the timestamps on your data
+and deletes data that are older than the RP's `DURATION`.
+A single database can have several RPs, and RPs are unique per database.
+
+This guide will not go into detail about the syntax for creating and managing
+CQs and RPs.
+If you're new to both concepts, we recommend looking over the detailed
+[CQ documentation](/influxdb/v1.3/query_language/continuous_queries/) and
+[RP documentation](/influxdb/v1.3/query_language/database_management/#retention-policy-management).
+
+### Sample data
+This section uses fictional real-time data that track the number of food orders
+to a restaurant via phone and via website at ten-second intervals.
+We will store those data in a
+[database](/influxdb/v1.3/concepts/glossary/#database) called `food_data`, in
+the [measurement](/influxdb/v1.3/concepts/glossary/#measurement) `orders`, and
+in the [fields](/influxdb/v1.3/concepts/glossary/#field) `phone` and `website`.
+ +Sample: +``` +name: orders +------------ +time phone website +2016-05-10T23:18:00Z 10 30 +2016-05-10T23:18:10Z 12 39 +2016-05-10T23:18:20Z 11 56 +``` + +### Goal +Assume that, in the long run, we're only interested in the average number of orders by phone +and by website at 30 minute intervals. +In the next steps, we use RPs and CQs to: + + * Automatically aggregate the ten-second resolution data to 30-minute resolution data + * Automatically delete the raw, ten-second resolution data that are older than two hours + * Automatically delete the 30-minute resolution data that are older than 52 weeks + +### Database Preparation +We perform the following steps before writing the data to the database +`food_data`. +We do this **before** inserting any data because CQs only run against recent +data; that is, data with timestamps that are no older than `now()` minus +the `FOR` clause of the CQ, or `now()` minus the `GROUP BY time()` interval if +the CQ has no `FOR` clause. + +#### 1. Create the database + +``` +> CREATE DATABASE "food_data" +``` + +#### 2. Create a two-hour `DEFAULT` RP + +InfluxDB writes to the `DEFAULT` RP if we do not supply an explicit RP when +writing a point to the database. +We make the `DEFAULT` RP keep data for two hours, because we want InfluxDB to +automatically write the incoming ten-second resolution data to that RP. + +Use the +[`CREATE RETENTION POLICY`](/influxdb/v1.3/query_language/database_management/#create-retention-policies-with-create-retention-policy) +statement to create a `DEFAULT` RP: + +```sql +> CREATE RETENTION POLICY "two_hours" ON "food_data" DURATION 2h REPLICATION 1 DEFAULT +``` + +That query creates an RP called `two_hours` that exists in the database +`food_data`. +`two_hours` keeps data for a `DURATION` of two hours (`2h`) and it's the `DEFAULT` +RP for the database `food_data`. + +{{% warn %}} +The replication factor (`REPLICATION 1`) is a required parameter but must always +be set to 1 for single node instances. 
+{{% /warn %}} + +> **Note:** When we created the `food_data` database in step 1, InfluxDB +automatically generated an RP named `autogen` and set it as the `DEFAULT` +RP for the database. +The `autogen` RP has an infinite retention period. +With the query above, the RP `two_hours` replaces `autogen` as the `DEFAULT` RP +for the `food_data` database. + +#### 3. Create a 52-week RP + +Next we want to create another RP that keeps data for 52 weeks and is not the +`DEFAULT` RP for the database. +Ultimately, the 30-minute rollup data will be stored in this RP. + +Use the +[`CREATE RETENTION POLICY`](/influxdb/v1.3/query_language/database_management/#create-retention-policies-with-create-retention-policy) +statement to create a non-`DEFAULT` RP: + +```sql +> CREATE RETENTION POLICY "a_year" ON "food_data" DURATION 52w REPLICATION 1 +``` + +That query creates an RP called `a_year` that exists in the database +`food_data`. +`a_year` keeps data for a `DURATION` of 52 weeks (`52w`). +Leaving out the `DEFAULT` argument ensures that `a_year` is not the `DEFAULT` +RP for the database `food_data`. +That is, write and read operations against `food_data` that do not specify an +RP will still go to the `two_hours` RP (the `DEFAULT` RP). + +#### 4. Create the CQ + +Now that we've set up our RPs, we want to create a CQ that will automatically +and periodically downsample the ten-second resolution data to the 30-minute +resolution, and store those results in a different measurement with a different +retention policy. + +Use the +[`CREATE CONTINUOUS QUERY`](/influxdb/v1.3/query_language/continuous_queries/) +statement to generate a CQ: + +```sql +> CREATE CONTINUOUS QUERY "cq_30m" ON "food_data" BEGIN + SELECT mean("website") AS "mean_website",mean("phone") AS "mean_phone" + INTO "a_year"."downsampled_orders" + FROM "orders" + GROUP BY time(30m) +END +``` + +That query creates a CQ called `cq_30m` in the database `food_data`. 
+`cq_30m` tells InfluxDB to calculate the 30-minute average of the two fields
+`website` and `phone` in the measurement `orders` and in the `DEFAULT` RP
+`two_hours`.
+It also tells InfluxDB to write those results to the measurement
+`downsampled_orders` in the retention policy `a_year` with the field keys
+`mean_website` and `mean_phone`.
+InfluxDB will run this query every 30 minutes for the previous 30 minutes.
+
+> **Note:** Notice that we fully qualify (that is, we use the syntax
+`"<retention_policy>"."<measurement>"`) the measurement in the `INTO`
+clause.
+InfluxDB requires that syntax to write data to an RP other than the `DEFAULT`
+RP.
+
+### Results
+
+With the new CQ and two new RPs, `food_data` is ready to start receiving data.
+After writing data to our database and letting things run for a bit, we see
+two measurements: `orders` and `downsampled_orders`.
+
+```bash
+> SELECT * FROM "orders" LIMIT 5
+name: orders
+---------
+time                   phone   website
+2016-05-13T23:00:00Z   10      30
+2016-05-13T23:00:10Z   12      39
+2016-05-13T23:00:20Z   11      56
+2016-05-13T23:00:30Z   8       34
+2016-05-13T23:00:40Z   17      32
+
+> SELECT * FROM "a_year"."downsampled_orders" LIMIT 5
+name: downsampled_orders
+---------------------
+time                   mean_phone   mean_website
+2016-05-13T15:00:00Z   12           23
+2016-05-13T15:30:00Z   13           32
+2016-05-13T16:00:00Z   19           21
+2016-05-13T16:30:00Z   3            26
+2016-05-13T17:00:00Z   4            23
+```
+
+The data in `orders` are the raw, ten-second resolution data that reside in the
+two-hour RP.
+The data in `downsampled_orders` are the aggregated, 30-minute resolution data
+that are subject to the 52-week RP.
+
+Notice that the first timestamps in `downsampled_orders` are older than the first
+timestamps in `orders`.
+This is because InfluxDB has already deleted data from `orders` with timestamps
+that are older than our local server's timestamp minus two hours (assume we
+executed the `SELECT` queries at `2016-05-14T00:59:59Z`).
+InfluxDB will only start dropping data from `downsampled_orders` after 52 weeks.
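The rollup that `cq_30m` performs can be pictured with a small stand-in. This is plain Python over hypothetical in-memory points, not an InfluxDB API; it only illustrates the bucketed-mean computation that `SELECT mean(...) ... GROUP BY time(30m)` runs server-side:

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for raw ten-second points in the "orders" measurement:
# (timestamp, phone, website)
points = [
    (datetime(2016, 5, 13, 15, 0, 0), 10, 30),
    (datetime(2016, 5, 13, 15, 0, 10), 12, 39),
    (datetime(2016, 5, 13, 15, 29, 50), 14, 33),
    (datetime(2016, 5, 13, 15, 30, 0), 11, 56),
]

def downsample(points, interval=timedelta(minutes=30)):
    """Group points into fixed time buckets and average each field,
    mirroring SELECT mean(...) ... GROUP BY time(30m)."""
    buckets = {}
    for ts, phone, website in points:
        # Floor the timestamp to the start of its 30-minute bucket.
        start = datetime.min + ((ts - datetime.min) // interval) * interval
        buckets.setdefault(start, []).append((phone, website))
    return {
        start: (sum(p for p, _ in vals) / len(vals),
                sum(w for _, w in vals) / len(vals))
        for start, vals in buckets.items()
    }

rollup = downsample(points)
# rollup maps each 30-minute bucket start to (mean_phone, mean_website).
```

The difference, of course, is that the CQ does this incrementally and automatically inside the database, writing its output into the `a_year` RP instead of returning it to a client.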
+
+> **Notes:**
+>
+* Notice that we fully qualify (that is, we use the syntax
+`"<retention_policy>"."<measurement>"`) `downsampled_orders` in
+the second `SELECT` statement. We must specify the RP in that query to `SELECT`
+data that reside in an RP other than the `DEFAULT` RP.
+>
+* By default, InfluxDB checks to enforce an RP every 30 minutes.
+Between checks, `orders` may have data that are older than two hours.
+The rate at which InfluxDB checks to enforce an RP is a configurable setting;
+see
+[Database Configuration](/influxdb/v1.3/administration/config/#check-interval-30m0s).
+
+Using a combination of RPs and CQs, we've successfully set up our database to
+automatically keep the high-precision raw data for a limited time, create
+lower-precision data, and store that lower-precision data for a longer period of time.
+Now that you have a general understanding of how these features can work
+together, we recommend looking at the detailed documentation on [CQs](/influxdb/v1.3/query_language/continuous_queries/) and [RPs](/influxdb/v1.3/query_language/database_management/#retention-policy-management)
+to see all that they can do for you.
diff --git a/content/influxdb/v1.3/guides/hardware_sizing.md b/content/influxdb/v1.3/guides/hardware_sizing.md
new file mode 100644
index 000000000..ba846ae53
--- /dev/null
+++ b/content/influxdb/v1.3/guides/hardware_sizing.md
@@ -0,0 +1,173 @@
+---
+title: Hardware Sizing Guidelines
+menu:
+  influxdb_1_3:
+    weight: 12
+    parent: Guides
+---
+
+This guide offers general hardware recommendations for InfluxDB and addresses some frequently asked questions about hardware sizing. The recommendations apply only to the [Time Structured Merge](/influxdb/v1.3/concepts/storage_engine/#the-new-influxdb-storage-engine-from-lsm-tree-to-b-tree-and-back-again-to-create-the-time-structured-merge-tree) tree (`TSM`) storage engine, the only storage engine available with InfluxDB 1.3.
Users running older versions of InfluxDB with [unconverted](/influxdb/v0.10/administration/upgrading/#convert-b1-and-bz1-shards-to-tsm1) `b1` or `bz1` shards may have different performance characteristics. See the [InfluxDB 0.9 sizing guide](/influxdb/v0.9/guides/hardware_sizing/) for more detail. + +* [Single node or Cluster?](/influxdb/v1.3/guides/hardware_sizing/#single-node-or-cluster) +* [General hardware guidelines for a single node](/influxdb/v1.3/guides/hardware_sizing/#general-hardware-guidelines-for-a-single-node) +* [General hardware guidelines for a cluster](/influxdb/v1.3/guides/hardware_sizing/#general-hardware-guidelines-for-a-cluster) +* [When do I need more RAM?](/influxdb/v1.3/guides/hardware_sizing/#when-do-i-need-more-ram) +* [What kind of storage do I need?](/influxdb/v1.3/guides/hardware_sizing/#what-kind-of-storage-do-i-need) +* [How much storage do I need?](/influxdb/v1.3/guides/hardware_sizing/#how-much-storage-do-i-need) +* [How should I configure my hardware?](/influxdb/v1.3/guides/hardware_sizing/#how-should-i-configure-my-hardware) + +# Single node or Cluster? +InfluxDB single node instances are fully open source. +InfluxDB clustering requires our closed-source commercial product. +Single node instances offer no redundancy. If the server is unavailable, writes and queries will fail immediately. +Clustering offers high-availability and redundancy. +Multiple copies of data are distributed across multiple servers, and the loss of any one server will not significantly impact the cluster. + +If your performance requirements fall into the [Moderate](#general-hardware-guidelines-for-a-single-node) or [Low load](#general-hardware-guidelines-for-a-single-node) ranges then you can likely use a single node instance of InfluxDB. 
+If at least one of your performance requirements falls into the [Probably infeasible category](#general-hardware-guidelines-for-a-single-node), then you will likely need to use a cluster to distribute the load among multiple servers. + +# General hardware guidelines for a single node + +We define the load that you'll be placing on InfluxDB by the number of fields written per second, the number of queries per second, and the number of unique [series](/influxdb/v1.3/concepts/glossary/#series). Based on your load, we make general CPU, RAM, and IOPS recommendations. + +InfluxDB should be run on locally attached SSDs. Any other storage configuration will have lower performance characteristics and may not be able to recover from even small interruptions in normal processing. + +| Load | Field writes per second | Moderate queries per second | Unique series | +|--------------|----------------|----------------|---------------| +| **Low** | < 5 thousand | < 5 | < 100 thousand | +| **Moderate** | < 250 thousand | < 25 | < 1 million | +| **High** | > 250 thousand | > 25 | > 1 million | +| **Probably infeasible** | > 750 thousand | > 100 | > 10 million | + +> **Note:** Queries vary widely in their impact on the system. 
+>
+Simple queries:
+>
+* Have few if any functions and no regular expressions
+* Are bounded in time to a few minutes, hours, or maybe a day
+* Typically execute in a few milliseconds to a few dozen milliseconds
+>
+Moderate queries:
+>
+* Have multiple functions and one or two regular expressions
+* May also have complex `GROUP BY` clauses or sample a time range of multiple weeks
+* Typically execute in a few hundred or a few thousand milliseconds
+>
+Complex queries:
+>
+* Have multiple aggregation or transformation functions or multiple regular expressions
+* May sample a very large time range of months or years
+* Typically take multiple seconds to execute
+
+### Low load recommendations
+* CPU: 2-4 cores
+* RAM: 2-4 GB
+* IOPS: 500
+
+### Moderate load recommendations
+* CPU: 4-6 cores
+* RAM: 8-32 GB
+* IOPS: 500-1000
+
+### High load recommendations
+* CPU: 8+ cores
+* RAM: 32+ GB
+* IOPS: 1000+
+
+### Probably infeasible load
+Performance at this scale is a significant challenge and may not be achievable. Please contact us for assistance with tuning your systems.
+
+# General hardware guidelines for a cluster
+
+## Meta Nodes
+A cluster must have at least three independent meta nodes to survive the loss of a server. A cluster with `2n + 1` meta nodes can tolerate the loss of `n` meta nodes. Clusters should have an odd number of meta nodes; there is no reason to use an even number, and doing so can lead to issues in certain configurations.
+
+Meta nodes do not need very much computing power. Regardless of the cluster load, we recommend the following for the meta nodes:
+
+### Universal recommendation
+* CPU: 1-2 cores
+* RAM: 512 MB - 1 GB
+* IOPS: 50
+
+## Data Nodes
+A cluster with only one data node is valid but has no data redundancy. The redundancy is set by the [replication factor](/influxdb/v1.3/concepts/glossary/#replication-factor) on the retention policy to which the data is written.
A cluster can lose `n - 1` data nodes and still return complete query results, where `n` is the replication factor. For optimal data distribution within the cluster, InfluxData recommends using an even number of data nodes. + +The hardware recommendations for cluster data nodes are similar to the standalone instance recommendations. Data nodes should always have at least 2 CPU cores, as they must handle regular read and write traffic, as well as intra-cluster read and write traffic. Due to the cluster communication overhead, data nodes in a cluster handle less throughput than a standalone instance on the same hardware. + +| Load | Field writes per second per node | Moderate queries per second per node | Unique series per node | +|--------------|----------------|----------------|---------------| +| **Low** | < 5 thousand | < 5 | < 100 thousand | +| **Moderate** | < 100 thousand | < 25 | < 1 million | +| **High** | > 100 thousand | > 25 | > 1 million | +| **Probably infeasible** | > 500 thousand | > 100 | > 10 million | + +> **Note:** Queries vary widely in their impact on the system. 
+>
+Simple queries:
+>
+* Have few if any functions and no regular expressions
+* Are bounded in time to a few minutes, hours, or maybe a day
+* Typically execute in a few milliseconds to a few dozen milliseconds
+>
+Moderate queries:
+>
+* Have multiple functions and one or two regular expressions
+* May also have complex `GROUP BY` clauses or sample a time range of multiple weeks
+* Typically execute in a few hundred or a few thousand milliseconds
+>
+Complex queries:
+>
+* Have multiple aggregation or transformation functions or multiple regular expressions
+* May sample a very large time range of months or years
+* Typically take multiple seconds to execute
+
+### Low load recommendations
+* CPU: 2 cores
+* RAM: 2-4 GB
+* IOPS: 1000
+
+### Moderate load recommendations
+* CPU: 4-6 cores
+* RAM: 8-32 GB
+* IOPS: 1000+
+
+### High load recommendations
+* CPU: 8+ cores
+* RAM: 32+ GB
+* IOPS: 1000+
+
+## Enterprise Web Node
+The Enterprise Web server is primarily an HTTP server with similar load requirements. For most applications it does not need to be very robust. The cluster will function with only one Web server, but for redundancy multiple Web servers can be connected to a single back-end Postgres database.
+
+> **Note:** Production clusters should not use the SQLite database as it does not allow for redundant Web servers, nor can it handle high loads as gracefully as Postgres.
+
+### Universal recommendation
+* CPU: 1-4 cores
+* RAM: 1-2 GB
+* IOPS: 50
+
+# When do I need more RAM?
+In general, having more RAM helps queries return faster. There is no known downside to adding more RAM.
+
+The major component that affects your RAM needs is [series cardinality](/influxdb/v1.3/concepts/glossary/#series-cardinality).
+A series cardinality around or above 10 million can cause OOM failures even with large amounts of RAM. If this is the case, you can usually address the problem by redesigning your [schema](/influxdb/v1.3/concepts/glossary/#schema).
+ +The increase in RAM needs relative to series cardinality is exponential where the exponent is between one and two: + +![Series Cardinality](/img/influxdb/series-cardinality.png) + +# What kind of storage do I need? +InfluxDB is designed to run on SSDs. InfluxData does not test on HDDs or networked storage devices, and we do not recommend them for production. Performance is an order of magnitude lower on spinning disk drives and the system may break down under even moderate load. For best results InfluxDB servers must have at least 1000 IOPS on the storage system. + +Please note that cluster data nodes have very high IOPS requirements when the cluster is recovering from downtime. It is recommended that the storage system have at least 2000 IOPS to allow for rapid recovery. Below 1000 IOPS, the cluster may not be able to recover from even a brief outage. + +# How much storage do I need? +Database names, [measurements](/influxdb/v1.3/concepts/glossary/#measurement), [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key), [field keys](/influxdb/v1.3/concepts/glossary/#field-key), and [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) are stored only once and always as strings. Only [field values](/influxdb/v1.3/concepts/glossary/#field-value) and [timestamps](/influxdb/v1.3/concepts/glossary/#timestamp) are stored per-point. + +Non-string values require approximately three bytes. String values require variable space as determined by string compression. + +# How should I configure my hardware? +When running InfluxDB in a production environment the `wal` directory and the `data` directory should be on separate storage devices. This optimization significantly reduces disk contention when the system is under heavy write load. This is an important consideration if the write load is highly variable. If the write load does not vary by more than 15% the optimization is probably unneeded. 
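For example, in the stock configuration file this separation is expressed by pointing the data and WAL directories at different mount points. The paths below are the default Linux package locations; adjust them to wherever your separate devices are mounted:

```toml
# influxdb.conf (excerpt): keep the TSM data files and the WAL on
# separate storage devices by mounting each path on its own volume.
[data]
  dir = "/var/lib/influxdb/data"     # TSM data files
  wal-dir = "/var/lib/influxdb/wal"  # write-ahead log
```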
diff --git a/content/influxdb/v1.3/guides/querying_data.md b/content/influxdb/v1.3/guides/querying_data.md
new file mode 100644
index 000000000..c501f7927
--- /dev/null
+++ b/content/influxdb/v1.3/guides/querying_data.md
@@ -0,0 +1,159 @@
+---
+title: Querying Data with the HTTP API
+aliases:
+  - /docs/v1.3/query_language/querying_data/
+menu:
+  influxdb_1_3:
+    weight: 10
+    parent: Guides
+---
+
+## Querying data using the HTTP API
+The HTTP API is the primary means for querying data in InfluxDB (see the [command line interface](/influxdb/v1.3/tools/shell/) and [client libraries](/influxdb/v1.3/tools/api_client_libraries/) for alternative ways to query the database).
+
+To perform a query, send a `GET` request to the `/query` endpoint, set the URL parameter `db` to the target database, and set the URL parameter `q` to your query.
+You may also use a `POST` request by sending the same parameters either as URL parameters or as part of the body with `application/x-www-form-urlencoded`.
+The example below uses the HTTP API to query the same database that you encountered in [Writing Data](/influxdb/v1.3/guides/writing_data/).
+```bash +curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'" +``` + +InfluxDB returns JSON. +The results of your query appear in the `"results"` array. +If an error occurs, InfluxDB sets an `"error"` key with an explanation of the error. +
+ +``` +{ + "results": [ + { + "statement_id": 0, + "series": [ + { + "name": "cpu_load_short", + "columns": [ + "time", + "value" + ], + "values": [ + [ + "2015-01-29T21:55:43.702900257Z", + 2 + ], + [ + "2015-01-29T21:55:43.702900257Z", + 0.55 + ], + [ + "2015-06-11T20:46:02Z", + 0.64 + ] + ] + } + ] + } + ] +} +``` + +> **Note:** Appending `pretty=true` to the URL enables pretty-printed JSON output. +While this is useful for debugging or when querying directly with tools like `curl`, it is not recommended for production use as it consumes unnecessary network bandwidth. + +### Multiple queries +--- +Send multiple queries to InfluxDB in a single API call. +Simply delimit each query using a semicolon, for example: +
+```bash +curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west';SELECT count(\"value\") FROM \"cpu_load_short\" WHERE \"region\"='us-west'" +``` + +returns: +
+``` +{ + "results": [ + { + "statement_id": 0, + "series": [ + { + "name": "cpu_load_short", + "columns": [ + "time", + "value" + ], + "values": [ + [ + "2015-01-29T21:55:43.702900257Z", + 2 + ], + [ + "2015-01-29T21:55:43.702900257Z", + 0.55 + ], + [ + "2015-06-11T20:46:02Z", + 0.64 + ] + ] + } + ] + }, + { + "statement_id": 1, + "series": [ + { + "name": "cpu_load_short", + "columns": [ + "time", + "count" + ], + "values": [ + [ + "1970-01-01T00:00:00Z", + 3 + ] + ] + } + ] + } + ] +} +``` + +### Other options when querying data +--- +#### Timestamp Format +Everything in InfluxDB is stored and reported in UTC. +By default, timestamps are returned in RFC3339 UTC and have nanosecond precision, for example `2015-08-04T19:05:14.318570484Z`. +If you want timestamps in Unix epoch format include in your request the query string parameter `epoch` where `epoch=[h,m,s,ms,u,ns]`. +For example, get epoch in seconds with: +
+```bash +curl -G 'http://localhost:8086/query' --data-urlencode "db=mydb" --data-urlencode "epoch=s" --data-urlencode "q=SELECT \"value\" FROM \"cpu_load_short\" WHERE \"region\"='us-west'" +``` + +#### Authentication +Authentication in InfluxDB is disabled by default. +See [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/) for how to enable and set up authentication. + +#### Maximum Row Limit +The [`max-row-limit` configuration option](/influxdb/v1.3/administration/config/#max-row-limit-0) allows users to limit the maximum number of returned results to prevent InfluxDB from running out of memory while it aggregates the results. +The `max-row-limit` configuration option is set to `0` by default. +That default setting allows for an unlimited number of rows returned per request. + +The maximum row limit only applies to non-chunked queries. Chunked queries can return an unlimited number of points. + +#### Chunking +Chunking can be used to return results in streamed batches rather than as a single response by setting the query string parameter `chunked=true`. Responses will be chunked by series or by every 10,000 points, whichever occurs first. To change the maximum chunk size to a different value, set the query string parameter `chunk_size` to a different value. +For example, get your results in batches of 20,000 points with: +
+```bash +curl -G 'http://localhost:8086/query' --data-urlencode "db=deluge" --data-urlencode "chunked=true" --data-urlencode "chunk_size=20000" --data-urlencode "q=SELECT * FROM liters" +``` + +### InfluxQL +--- +Now that you know how to query data, check out the [Data Exploration page](/influxdb/v1.3/query_language/data_exploration/) to get acquainted with InfluxQL. +For more information about querying data with the HTTP API, please see the [API reference documentation](/influxdb/v1.3/tools/api/#query). diff --git a/content/influxdb/v1.3/guides/writing_data.md b/content/influxdb/v1.3/guides/writing_data.md new file mode 100644 index 000000000..f8ddfcef4 --- /dev/null +++ b/content/influxdb/v1.3/guides/writing_data.md @@ -0,0 +1,157 @@ +--- +title: Writing Data with the HTTP API + +menu: + influxdb_1_3: + weight: 1 + parent: Guides +--- + +There are many ways to write data into InfluxDB including the [command line interface](/influxdb/v1.3/tools/shell/), [client libraries](/influxdb/v1.3/clients/api/) and plugins for common data formats such as [Graphite](/influxdb/v1.3/write_protocols/graphite/). +Here we'll show you how to create a database and write data to it using the built-in HTTP API. + +## Creating a database using the HTTP API +To create a database send a `POST` request to the `/query` endpoint and set the URL parameter `q` to `CREATE DATABASE `. +The example below sends a request to InfluxDB running on `localhost` and creates the database `mydb`: +
+ +```bash +curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb" +``` + +## Writing data using the HTTP API +The HTTP API is the primary means of writing data into InfluxDB, by sending `POST` requests to the `/write` endpoint. +The example below writes a single point to the `mydb` database. +The data consist of the [measurement](/influxdb/v1.3/concepts/glossary/#measurement) `cpu_load_short`, the [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key) `host` and `region` with the [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) `server01` and `us-west`, the [field key](/influxdb/v1.3/concepts/glossary/#field-key) `value` with a [field value](/influxdb/v1.3/concepts/glossary/#field-value) of `0.64`, and the [timestamp](/influxdb/v1.3/concepts/glossary/#timestamp) `1434055562000000000`. +
+
+```bash
+curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'
+```
+When writing points, you must specify an existing database in the `db` query parameter.
+Points will be written to `db`'s default retention policy if you do not supply a retention policy via the `rp` query parameter.
+See the [API Reference](/influxdb/v1.3/tools/api/#write) documentation for a complete list of the available query parameters.
+
+The body of the POST - we call this the [Line Protocol](/influxdb/v1.3/concepts/glossary/#line-protocol) - contains the time-series data that you wish to store.
+Each point consists of a measurement, tags, fields, and a timestamp.
+InfluxDB requires a measurement name.
+Strictly speaking, tags are optional, but most series include tags to differentiate data sources and to make querying both easy and efficient.
+Both tag keys and tag values are strings.
+Field keys are required and are always strings, and, [by default](/influxdb/v1.3/write_protocols/line_protocol_reference/#data-types), field values are floats.
+The timestamp - supplied at the end of the line in Unix time in nanoseconds since January 1, 1970 UTC - is optional.
+If you do not specify a timestamp, InfluxDB uses the server's local nanosecond timestamp in Unix epoch.
+Anything that has to do with time in InfluxDB is always UTC.
+
+### Writing multiple points
+---
+Post multiple points to multiple series at the same time by separating each point with a new line.
+Batching points in this manner results in much higher performance.
+
+The following example writes three points to the database `mydb`.
+The first point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02` and has the server's local timestamp.
+The second point belongs to the series with the measurement `cpu_load_short` and tag set `host=server02,region=us-west` and has the specified timestamp `1422568543702900257`.
+The third point has the same specified timestamp as the second point, but it is written to the series with the measurement `cpu_load_short` and tag set `direction=in,host=server01,region=us-west`. +
+ +```bash +curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67 +cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257 +cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257' +``` + +### Writing points from a file +--- +Write points from a file by passing `@filename` to `curl`. +The data in the file should follow InfluxDB's [line protocol syntax](/influxdb/v1.3/write_protocols/write_syntax/). + +Example of a properly-formatted file (`cpu_data.txt`): +
+```txt +cpu_load_short,host=server02 value=0.67 +cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257 +cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257 +``` + +Write the data in `cpu_data.txt` to the `mydb` database with: +
+`curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt`
+
+> **Note:** If your data file has more than 5,000 points, it may be necessary to split that file into several files in order to write your data in batches to InfluxDB.
+By default, the HTTP request times out after five seconds.
+InfluxDB will still attempt to write the points after that timeout, but there will be no confirmation that they were successfully written.
+
+### Schemaless Design
+---
+InfluxDB is a schemaless database.
+You can add new measurements, tags, and fields at any time.
+Note that if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB will reject those points.
+
+### A note on REST...
+---
+InfluxDB uses HTTP solely as a convenient and widely supported data transfer protocol.
+
+Modern web APIs have settled on REST because it addresses a common need.
+As the number of endpoints grows, the need for an organizing system becomes pressing.
+REST is the industry-agreed style for organizing large numbers of endpoints.
+This consistency is good for those developing and consuming the API: everyone involved knows what to expect.
+
+REST, however, is a convention.
+InfluxDB makes do with three API endpoints.
+This simple, easy-to-understand system uses HTTP as a transfer method for [InfluxQL](/influxdb/v1.3/query_language/spec/).
+The InfluxDB API makes no attempt to be RESTful.
+
+### HTTP response summary
+---
+* 2xx: If your write request received `HTTP 204 No Content`, it was a success!
+* 4xx: InfluxDB could not understand the request.
+* 5xx: The system is overloaded or significantly impaired.
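These status classes map directly onto client-side handling. A minimal sketch of that branching (a hypothetical helper written for illustration, not part of InfluxDB or any client library):

```python
def classify_write_status(code):
    """Classify an HTTP status code returned by InfluxDB's /write endpoint.

    A successful write returns 204 No Content; 4xx means InfluxDB could not
    understand the request (malformed line protocol, unknown database, ...);
    5xx means the server is overloaded or significantly impaired.
    """
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unexpected"

print(classify_write_status(204))  # success
print(classify_write_status(404))  # client error
```

A real client would typically retry on 5xx but surface 4xx errors immediately, since resending a malformed request cannot succeed.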
+ +**Examples of error responses:** + +* Writing a float to a field that previously accepted booleans: + +```bash +curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=true' + +curl -i -XPOST 'http://localhost:8086/write?db=hamlet' --data-binary 'tobeornottobe booleanonly=5' +``` + +returns: +
+ +```bash +HTTP/1.1 400 Bad Request +Content-Type: application/json +Request-Id: [...] +X-Influxdb-Version: 1.3.x +Date: Wed, 01 Mar 2017 19:38:01 GMT +Content-Length: 150 + +{"error":"field type conflict: input field \"booleanonly\" on measurement \"tobeornottobe\" is type float, already exists as type boolean dropped=1"} +``` + +* Writing a point to a database that doesn't exist: + +```bash +curl -i -XPOST 'http://localhost:8086/write?db=atlantis' --data-binary 'liters value=10' +``` + +returns: +
+
+```bash
+HTTP/1.1 404 Not Found
+Content-Type: application/json
+Request-Id: [...]
+X-Influxdb-Version: 1.3.x
+Date: Wed, 01 Mar 2017 19:38:35 GMT
+Content-Length: 45
+
+{"error":"database not found: \"atlantis\""}
+```
+
+### Next steps
+---
+Now that you know how to write data with the built-in HTTP API, discover how to query it with the [Querying Data](/influxdb/v1.3/guides/querying_data/) guide!
+For more information about writing data with the HTTP API, please see the [API reference documentation](/influxdb/v1.3/tools/api/#write).
diff --git a/content/influxdb/v1.3/high_availability/_index.md b/content/influxdb/v1.3/high_availability/_index.md
new file mode 100644
index 000000000..083b1224a
--- /dev/null
+++ b/content/influxdb/v1.3/high_availability/_index.md
@@ -0,0 +1,13 @@
+---
+title: High availability with InfluxDB Enterprise
+menu:
+  influxdb_1_3:
+    name: High availability
+    weight: 100
+---
+
+## [Clustering](/influxdb/v1.3/high_availability/clusters/)
+Open-source InfluxDB does not support clustering.
+For high availability or horizontal scaling of InfluxDB, please investigate our
+commercial clustered offering,
+[InfluxEnterprise](https://portal.influxdata.com/).
diff --git a/content/influxdb/v1.3/high_availability/clusters.md b/content/influxdb/v1.3/high_availability/clusters.md
new file mode 100644
index 000000000..df529338c
--- /dev/null
+++ b/content/influxdb/v1.3/high_availability/clusters.md
@@ -0,0 +1,17 @@
+---
+title: Clustering
+aliases:
+  - influxdb/v1.3/clustering/
+  - influxdb/v1.3/clustering/cluster_setup/
+  - influxdb/v1.3/clustering/cluster_node_config/
+  - influxdb/v1.3/guides/clustering/
+menu:
+  influxdb_1_3:
+    weight: 1
+    parent: High availability
+---
+
+InfluxDB OSS does not support clustering.
+For high availability or horizontal scaling of InfluxDB, consider the InfluxData
+commercial clustered offering,
+[InfluxDB Enterprise](/enterprise_influxdb/latest/).
diff --git a/content/influxdb/v1.3/introduction/_index.md b/content/influxdb/v1.3/introduction/_index.md
new file mode 100644
index 000000000..90581aba9
--- /dev/null
+++ b/content/influxdb/v1.3/introduction/_index.md
@@ -0,0 +1,22 @@
+---
+title: Introducing InfluxDB OSS
+menu:
+  influxdb_1_3:
+    name: Introduction
+    identifier: intro-1-3
+    weight: 20
+---
+
+The introductory documentation includes all the information you need to get up and running with InfluxDB.
+
+## [Downloading](https://influxdata.com/downloads/#influxdb)
+
+Provides the location to download the latest stable and nightly builds of InfluxDB.
+
+## [Installing InfluxDB](/influxdb/v1.3/introduction/installation/)
+
+Provides instructions for installing InfluxDB on Ubuntu, Debian, Red Hat, CentOS, and macOS.
+
+## [Getting started with InfluxDB](/influxdb/v1.3/introduction/getting_started/)
+
+An introductory guide to reading and writing time series data using InfluxDB.
diff --git a/content/influxdb/v1.3/introduction/getting_started.md b/content/influxdb/v1.3/introduction/getting_started.md
new file mode 100644
index 000000000..59ce64bab
--- /dev/null
+++ b/content/influxdb/v1.3/introduction/getting_started.md
@@ -0,0 +1,191 @@
+---
+title: Getting Started
+menu:
+  influxdb_1_3:
+    weight: 20
+    parent: intro-1-3
+---
+
+With InfluxDB [installed](/influxdb/v1.3/introduction/installation), you're ready to start doing some awesome things.
+In this section we'll use the `influx` [command line interface](/influxdb/v1.3/tools/shell/) (CLI), which is included in all
+InfluxDB packages and is a lightweight and simple way to interact with the database.
+The CLI communicates with InfluxDB directly by making requests to the InfluxDB HTTP API over port `8086` by default.
+
+> **Note:** The database can also be used by making raw HTTP requests.
+See [Writing Data](/influxdb/v1.3/guides/writing_data/) and [Querying Data](/influxdb/v1.3/guides/querying_data/)
+for examples with the `curl` application.
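To show what such a raw request looks like, here is a sketch that builds the URL the CLI effectively sends to the `/query` endpoint, using only the standard library (the host and port are the defaults mentioned above; no server is contacted):

```python
from urllib.parse import urlencode

def query_url(q, host="localhost", port=8086, db=None):
    """Build a URL for InfluxDB's /query HTTP API endpoint."""
    params = {"q": q}
    if db is not None:
        params["db"] = db  # most statements run against a specific database
    return "http://%s:%d/query?%s" % (host, port, urlencode(params))

print(query_url("SHOW DATABASES"))
# http://localhost:8086/query?q=SHOW+DATABASES
```

Passing the resulting URL to `curl` (or any HTTP client) is equivalent to typing the statement at the `influx` prompt.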
+
+## Creating a database
+
+If you've installed InfluxDB locally, the `influx` command should be available via the command line.
+Executing `influx` will start the CLI and automatically connect to the local InfluxDB instance
+(assuming you have already started the server with `service influxdb start` or by running `influxd` directly).
+The output should look like this:
+
+```bash
+$ influx -precision rfc3339
+Connected to http://localhost:8086 version 1.3.x
+InfluxDB shell 1.3.x
+>
+```
+
+> **Notes:**
+>
+* The InfluxDB HTTP API runs on port `8086` by default.
+Therefore, `influx` will connect to port `8086` and `localhost` by default.
+If you need to alter these defaults, run `influx --help`.
+* The [`-precision` argument](/influxdb/v1.3/tools/shell/#influx-arguments) specifies the format/precision of any returned timestamps.
+In the example above, `rfc3339` tells InfluxDB to return timestamps in [RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt) (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`).
+
+The command line is now ready to take input in the form of Influx Query Language (a.k.a. InfluxQL) statements.
+To exit the InfluxQL shell, type `exit` and hit return.
+
+A fresh install of InfluxDB has no databases (apart from the system `_internal`),
+so creating one is our first task.
+You can create a database with the `CREATE DATABASE <db_name>` InfluxQL statement,
+where `<db_name>` is the name of the database you wish to create.
+Database names can contain any Unicode character as long as the string is double-quoted.
+Names can also be left unquoted if they contain _only_ ASCII letters,
+digits, or underscores and do not begin with a digit.
+
+Throughout this guide, we'll use the database name `mydb`:
+
+```sql
+> CREATE DATABASE mydb
+>
+```
+
+> **Note:** After hitting enter, a new prompt appears and nothing else is displayed.
+In the CLI, this means the statement was executed and there were no errors to display.
+There will always be an error displayed if something went wrong.
+No news is good news!
+
+Now that the `mydb` database is created, we'll use the `SHOW DATABASES` statement
+to display all existing databases:
+
+```sql
+> SHOW DATABASES
+name: databases
+---------------
+name
+_internal
+mydb
+
+>
+```
+
+> **Note:** The `_internal` database is created and used by InfluxDB to store internal runtime metrics.
+Check it out later to get an interesting look at how InfluxDB is performing under the hood.
+
+Unlike `SHOW DATABASES`, most InfluxQL statements must operate against a specific database.
+You may explicitly name the database with each query,
+but the CLI provides a convenience statement, `USE <db>`,
+which will automatically set the database for all future requests. For example:
+
+```sql
+> USE mydb
+Using database mydb
+>
+```
+
+Now future commands will only be run against the `mydb` database.
+
+## Writing and exploring data
+
+Now that we have a database, InfluxDB is ready to accept queries and writes.
+
+First, a short primer on the datastore.
+Data in InfluxDB is organized by "time series",
+which contain a measured value, like "cpu_load" or "temperature".
+Time series have zero to many `points`, one for each discrete sample of the metric.
+Points consist of `time` (a timestamp), a `measurement` ("cpu_load", for example),
+at least one key-value `field` (the measured value itself, e.g.
+"value=0.64", or "temperature=21.2"), and zero to many key-value `tags` containing any metadata about the value (e.g.
+"host=server01", "region=EMEA", "dc=Frankfurt").
+
+Conceptually you can think of a `measurement` as an SQL table,
+where the primary index is always time.
+`tags` and `fields` are effectively columns in the table.
+`tags` are indexed, and `fields` are not.
+The difference is that, with InfluxDB, you can have millions of measurements,
+you don't have to define schemas up-front, and null values aren't stored.
+
+Points are written to InfluxDB using the Line Protocol, which has the following format:
+
+```
+<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
+```
+
+The following lines are all examples of points that can be written to InfluxDB:
+
+```
+cpu,host=serverA,region=us_west value=0.64
+payment,device=mobile,product=Notepad,method=credit billed=33,licenses=3i 1434067467100293230
+stock,symbol=AAPL bid=127.46,ask=127.48
+temperature,machine=unit42,type=assembly external=25,internal=37 1434067467000000000
+```
+
+> **Note:** More information on the line protocol can be found on the [Write Syntax](/influxdb/v1.3/write_protocols/write_syntax/) page.
+
+To insert a single time-series datapoint into InfluxDB using the CLI, enter `INSERT` followed by a point:
+
+```sql
+> INSERT cpu,host=serverA,region=us_west value=0.64
+>
+```
+
+A point with the measurement name of `cpu` and tags `host` and `region` has now been written to the database, with the measured `value` of `0.64`.
+
+Now we will query for the data we just wrote:
+
+```sql
+> SELECT "host", "region", "value" FROM "cpu"
+name: cpu
+---------
+time                            host     region   value
+2015-10-21T19:28:07.580664347Z  serverA  us_west  0.64
+
+>
+```
+
+> **Note:** We did not supply a timestamp when writing our point.
+When no timestamp is supplied for a point, InfluxDB assigns the local current timestamp when the point is ingested.
+That means your timestamp will be different.
+
+Let's try storing another type of data, with two fields in the same measurement:
+
+```sql
+> INSERT temperature,machine=unit42,type=assembly external=25,internal=37
+>
+```
+
+To return all fields and tags with a query, you can use the `*` operator:
+
+```sql
+> SELECT * FROM "temperature"
+name: temperature
+-----------------
+time                            external  internal  machine  type
+2015-10-21T19:28:08.385013942Z  25        37        unit42   assembly
+
+>
+```
+
+InfluxQL has many [features and keywords](/influxdb/v1.3/query_language/spec/) that are not covered here,
+including support for Go-style regex.
For example:
+
+```sql
+> SELECT * FROM /.*/ LIMIT 1
+--
+> SELECT * FROM "cpu_load_short"
+--
+> SELECT * FROM "cpu_load_short" WHERE "value" > 0.9
+```
+
+This is all you need to know to write data into InfluxDB and query it back.
+To learn more about the InfluxDB write protocol,
+check out the guide on [Writing Data](/influxdb/v1.3/guides/writing_data/).
+To further explore the query language,
+check out the guide on [Querying Data](/influxdb/v1.3/guides/querying_data/).
+For more information on InfluxDB concepts, check out the [Key Concepts](/influxdb/v1.3/concepts/key_concepts/) page.
diff --git a/content/influxdb/v1.3/introduction/installation.md b/content/influxdb/v1.3/introduction/installation.md
new file mode 100644
index 000000000..704633eab
--- /dev/null
+++ b/content/influxdb/v1.3/introduction/installation.md
@@ -0,0 +1,288 @@
+---
+title: Installation
+menu:
+  influxdb_1_3:
+    weight: 10
+    parent: intro-1-3
+---
+
+This page provides directions for installing, starting, and configuring InfluxDB.
+
+## Requirements
+
+Installation of the InfluxDB package may require `root` or administrator privileges in order to complete successfully.
+
+### Networking
+
+By default, InfluxDB uses the following network ports:
+
+- TCP port `8086` is used for client-server communication over InfluxDB's HTTP API
+- TCP port `8088` is used for the RPC service for backup and restore
+
+In addition to the ports above, InfluxDB also offers multiple plugins that may
+require [custom ports](/influxdb/v1.3/administration/ports/).
+All port mappings can be modified through the [configuration file](/influxdb/v1.3/administration/config),
+which is located at `/etc/influxdb/influxdb.conf` for default installations.
+
+### NTP
+
+InfluxDB uses a host's local time in UTC to assign timestamps to data and for
+coordination purposes.
+Use the Network Time Protocol (NTP) to synchronize time between hosts; if hosts'
+clocks aren't synchronized with NTP, the timestamps on the data written to InfluxDB
+can be inaccurate.
+
+## Installation
+
+If you don't want to install any software and are ready to use InfluxDB,
+you may want to check out our
+[managed hosted InfluxDB offering](https://cloud.influxdata.com).
+
+{{< tabs-wrapper >}}
+{{% tabs %}}
+[Ubuntu & Debian](#)
+[Red Hat & CentOS](#)
+[SLES & openSUSE](#)
+[FreeBSD/PC-BSD](#)
+[macOS](#)
+{{% /tabs %}}
+
+{{% tab-content %}}
+For instructions on how to install the Debian package from a file,
+please see the
+[downloads page](https://influxdata.com/downloads/).
+
+Debian and Ubuntu
+users can install the latest stable version of InfluxDB using the
+`apt-get` package manager.
+
+For Ubuntu users, add the InfluxData repository with the following commands:
+
+```bash
+curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
+source /etc/lsb-release
+echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
+```
+
+For Debian users, add the InfluxData repository:
+
+```bash
+curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
+source /etc/os-release
+test $VERSION_ID = "7" && echo "deb https://repos.influxdata.com/debian wheezy stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
+test $VERSION_ID = "8" && echo "deb https://repos.influxdata.com/debian jessie stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
+test $VERSION_ID = "9" && echo "deb https://repos.influxdata.com/debian stretch stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
+```
+
+Then, install and start the InfluxDB service:
+
+```bash
+sudo apt-get update && sudo apt-get install influxdb
+sudo service influxdb start
+```
+
+Or if your operating system is using systemd (Ubuntu 15.04+, Debian 8+):
+
+```bash
+sudo apt-get update && sudo apt-get 
install influxdb
+sudo systemctl start influxdb
+```
+
+{{% /tab-content %}}
+
+{{% tab-content %}}
+
+For instructions on how to install the RPM package from a file, please see the [downloads page](https://influxdata.com/downloads/).
+
+Red Hat and CentOS users can install the latest stable version of InfluxDB using the `yum` package manager:
+
+```bash
+cat <}}
+
+## Configuration
+
+The system has internal defaults for every configuration file setting.
+View the default configuration settings with the `influxd config` command.
+
+Most of the settings in the local configuration file
+(`/etc/influxdb/influxdb.conf`) are commented out; all
+commented-out settings will be determined by the internal defaults.
+Any uncommented settings in the local configuration file override the
+internal defaults.
+Note that the local configuration file does not need to include every
+configuration setting.
+
+There are two ways to launch InfluxDB with your configuration file:
+
+* Point the process to the correct configuration file by using the `-config`
+option:
+
+    ```bash
+    influxd -config /etc/influxdb/influxdb.conf
+    ```
+* Set the environment variable `INFLUXDB_CONFIG_PATH` to the path of your
+configuration file and start the process.
+For example:
+
+    ```
+    echo $INFLUXDB_CONFIG_PATH
+    /etc/influxdb/influxdb.conf
+
+    influxd
+    ```
+
+InfluxDB first checks for the `-config` option and then for the environment
+variable.
+
+See the [Configuration](/influxdb/v1.3/administration/config/) documentation for more information.
+
+### Data & WAL Directory Permissions
+
+Make sure the directories in which data and the [write ahead log (WAL)](/influxdb/v1.3/concepts/glossary/#wal-write-ahead-log) are stored are writable for the user running the `influxd` service.
+
+> **Note:** If the data and WAL directories are not writable, the `influxd` service will not start.
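A quick way to verify the requirement above before starting the service; a sketch only, and the paths shown are the package defaults, which may differ if your configuration moves them:

```python
import os

def writable_by_current_user(path):
    """Return True if `path` is an existing directory the current user can write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)

# Package-default data and WAL locations; adjust to match your configuration.
for d in ("/var/lib/influxdb/data", "/var/lib/influxdb/wal"):
    print(d, "writable" if writable_by_current_user(d) else "NOT writable")
```

Run this as the same user that runs `influxd`; a "NOT writable" result means the service will fail to start until ownership or permissions are corrected.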
+
+Information about `data` and `wal` directory paths is available in the [Configuration](/influxdb/v1.3/administration/config/#data) documentation.
+
+## Hosting on AWS
+
+### Hardware
+
+We recommend using two SSD volumes:
+one for the `influxdb/wal` and one for the `influxdb/data`.
+Depending on your load, each volume should have around 1k-3k provisioned IOPS.
+The `influxdb/data` volume should have more disk space with lower IOPS and the `influxdb/wal` volume should have less disk space with higher IOPS.
+
+Each machine should have a minimum of 8GB of RAM.
+
+We've seen the best performance with the R4 class of machines, as they provide more memory than either the C3/C4 classes or the M4 class.
+
+### Configuring the Instance
+
+This example assumes that you are using two SSD volumes and that you have mounted them appropriately.
+This example also assumes that each of those volumes is mounted at `/mnt/influx` and `/mnt/db`.
+For more information on how to do that see the Amazon documentation on how to [Add a Volume to Your Instance](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-attaching-volume.html).
+
+### Config File
+You'll have to update the config file appropriately for each InfluxDB instance you have.
+
+```
+...
+
+[meta]
+  dir = "/mnt/db/meta"
+  ...
+
+...
+
+[data]
+  dir = "/mnt/db/data"
+  ...
+  wal-dir = "/mnt/influx/wal"
+  ...
+
+...
+
+[hinted-handoff]
+  ...
+  dir = "/mnt/db/hh"
+  ...
+``` + +### Permissions + +When using non-standard directories for InfluxDB data and configurations, also be sure to set filesystem permissions correctly: + +```bash +chown influxdb:influxdb /mnt/influx +chown influxdb:influxdb /mnt/db +``` diff --git a/content/influxdb/v1.3/query_language/_index.md b/content/influxdb/v1.3/query_language/_index.md new file mode 100644 index 000000000..9b3176669 --- /dev/null +++ b/content/influxdb/v1.3/query_language/_index.md @@ -0,0 +1,82 @@ +--- +title: Query Language +menu: + influxdb_1_3: + weight: 70 + identifier: influxql +--- + +This section introduces InfluxQL, InfluxDB's SQL-like query language for +interacting with data in InfluxDB. + +## InfluxQL Tutorial +The first seven documents in this section provide a tutorial-style introduction +to InfluxQL. +Feel free to download the dataset provided in +[Sample Data](/influxdb/v1.3/query_language/data_download/) and follow along +with the documentation. + +#### [Data Exploration](/influxdb/v1.3/query_language/data_exploration/) + +Covers the query language basics for InfluxQL, including the +[`SELECT` statement](/influxdb/v1.3/query_language/data_exploration/#the-basic-select-statement), +[`GROUP BY` clauses](/influxdb/v1.3/query_language/data_exploration/#the-group-by-clause), +[`INTO` clauses](/influxdb/v1.3/query_language/data_exploration/#the-into-clause), and more. +See Data Exploration to learn about +[time syntax](/influxdb/v1.3/query_language/data_exploration/#time-syntax) and +[regular expressions](/influxdb/v1.3/query_language/data_exploration/#regular-expressions) in +queries. + +#### [Schema Exploration](/influxdb/v1.3/query_language/schema_exploration/) + +Covers queries that are useful for viewing and exploring your +[schema](/influxdb/v1.3/concepts/glossary/#schema). +See Schema Exploration for syntax explanations and examples of InfluxQL's `SHOW` +queries. 
+ +#### [Database Management](/influxdb/v1.3/query_language/database_management/) + +Covers InfluxQL for managing +[databases](/influxdb/v1.3/concepts/glossary/#database) and +[retention policies](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) in +InfluxDB. +See Database Management for creating and dropping databases and retention +policies as well as deleting and dropping data. + +#### [Functions](/influxdb/v1.3/query_language/functions/) + +Covers all [InfluxQL functions](/influxdb/v1.3/query_language/functions/). + +#### [Continuous Queries](/influxdb/v1.3/query_language/continuous_queries/) + +Covers the +[basic syntax](/influxdb/v1.3/query_language/continuous_queries/#basic-syntax) +, +[advanced syntax](/influxdb/v1.3/query_language/continuous_queries/#advanced-syntax) +, +and +[common use cases](/influxdb/v1.3/query_language/continuous_queries/#continuous-query-use-cases) +for +[Continuous Queries](/influxdb/v1.3/concepts/glossary/#continuous-query-cq). +This page also describes how to +[`SHOW`](/influxdb/v1.3/query_language/continuous_queries/#list-cqs) and +[`DROP`](/influxdb/v1.3/query_language/continuous_queries/#delete-cqs) +Continuous Queries. + +#### [Mathematical Operators](/influxdb/v1.3/query_language/math_operators/) + +Covers the use of mathematical operators in InfluxQL. + +#### [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/) + +Covers how to +[set up authentication](/influxdb/v1.3/query_language/authentication_and_authorization/#set-up-authentication) +and how to +[authenticate requests](/influxdb/v1.3/query_language/authentication_and_authorization/#authenticate-requests) in InfluxDB. +This page also describes the different +[user types](/influxdb/v1.3/query_language/authentication_and_authorization/#user-types-and-privileges) and the InfluxQL for +[managing database users](/influxdb/v1.3/query_language/authentication_and_authorization/#user-management-commands). 
+ +## [InfluxQL Reference](/influxdb/v1.3/query_language/spec/) + +The reference documentation for InfluxQL. diff --git a/content/influxdb/v1.3/query_language/authentication_and_authorization.md b/content/influxdb/v1.3/query_language/authentication_and_authorization.md new file mode 100644 index 000000000..ccd64928c --- /dev/null +++ b/content/influxdb/v1.3/query_language/authentication_and_authorization.md @@ -0,0 +1,429 @@ +--- +title: Authentication and Authorization +aliases: + - influxdb/v1.3/administration/authentication_and_authorization/ +menu: + influxdb_1_3: + weight: 90 + parent: influxql +--- + +This document covers setting up and managing authentication and authorization in InfluxDB. + + + + + + + + + + + + + + + + + + +
| Authentication | Authorization |
|:---------------|:--------------|
| [Set up Authentication](#set-up-authentication) | [User Types and Privileges](#user-types-and-privileges) |
| [Authenticate Requests](#authenticate-requests) | [User Management Commands](#user-management-commands) |
| HTTP Errors | HTTP Errors |
+
+> **Note:** Authentication and authorization should not be relied upon to prevent access and protect data from malicious actors.
+If additional security or compliance features are desired, InfluxDB should be run behind a third party service.
+
+## Authentication
+
+InfluxDB's HTTP API and the [command line interface](/influxdb/v1.3/tools/shell/) (CLI), which connects to the database using the API, include simple, built-in authentication based on user credentials.
+When you enable authentication, InfluxDB only executes HTTP requests that are sent with valid credentials.
+
+> **Note:** Authentication only occurs at the HTTP request scope.
+Plugins do not currently have the ability to authenticate requests, and plugin service endpoints (for example, Graphite and collectd) are not authenticated.
+
+### Set up Authentication
+
+#### 1. Create at least one [admin user](#admin-users).
+See the [authorization section](#authorization) for how to create an admin user.
+
+> **Note:** If you enable authentication and have no users, InfluxDB will **not** enforce authentication and will only accept the [query](#user-management-commands) that creates a new admin user.
+
+InfluxDB will enforce authentication once there is an admin user.
+
+#### 2. By default, authentication is disabled in the configuration file.
+Enable authentication by setting the `auth-enabled` option to `true` in the `[http]` section of the configuration file:
+
+```
+[http]
+  enabled = true
+  bind-address = ":8086"
+  auth-enabled = true # ✨
+  log-enabled = true
+  write-tracing = false
+  pprof-enabled = false
+  https-enabled = false
+  https-certificate = "/etc/ssl/influxdb.pem"
+```
+
+#### 3. Restart the process.
+
+Now InfluxDB will check user credentials on every request and will only process requests that have valid credentials for an existing user.
+
+### Authenticate Requests
+
+#### Authenticate with the HTTP API
+There are two options for authenticating with the [HTTP API](/influxdb/v1.3/tools/api/).
+ +If you authenticate with both Basic Authentication **and** the URL query parameters, the user credentials specified in the query parameters take precedence. +The queries in the following examples assume that the user is an [admin user](#admin-users). +See the section on [authorization](#authorization) for the different user types, their privileges, and more on user management. + +> **Note:** InfluxDB redacts passwords when you enable authentication. + +##### Authenticate with Basic Authentication as described in [RFC 2617, Section 2](http://tools.ietf.org/html/rfc2617) +
+This is the preferred method for providing user credentials. + +Example: + +```bash +curl -G http://localhost:8086/query -u todd:influxdb4ever --data-urlencode "q=SHOW DATABASES" +``` + +##### Authenticate by providing query parameters in the URL or request body +
+Set `u` as the username and `p` as the password. + +Example using query parameters: + +```bash +curl -G "http://localhost:8086/query?u=todd&p=influxdb4ever" --data-urlencode "q=SHOW DATABASES" +``` + +Example using request body: + +```bash +curl -G http://localhost:8086/query --data-urlencode "u=todd" --data-urlencode "p=influxdb4ever" --data-urlencode "q=SHOW DATABASES" +``` + +#### Authenticate with the CLI +There are three options for authenticating with the [CLI](/influxdb/v1.3/tools/shell/). + +##### Authenticate with the `INFLUX_USERNAME` and `INFLUX_PASSWORD` environment variables +
+
+Example:
+
+```
+export INFLUX_USERNAME=todd
+export INFLUX_PASSWORD=influxdb4ever
+echo $INFLUX_USERNAME $INFLUX_PASSWORD
+todd influxdb4ever
+
+influx
+Connected to http://localhost:8086 version 1.3.x
+InfluxDB shell 1.3.x
+```
+
+##### Authenticate by setting the `username` and `password` flags when you start the CLI
+
+Example:
+
+```bash
+influx -username todd -password influxdb4ever
+Connected to http://localhost:8086 version 1.3.x
+InfluxDB shell 1.3.x
+```
+
+##### Authenticate with `auth <username> <password>` after starting the CLI
+Example: + +```bash +influx +Connected to http://localhost:8086 version 1.3.x +InfluxDB shell 1.3.x +> auth +username: todd +password: +> +``` + + +> +## Authenticate Telegraf requests to InfluxDB +> +Authenticating [Telegraf](/telegraf/v1.3/) requests to an InfluxDB instance with +authentication enabled requires some additional steps. +In Telegraf's configuration file (`/etc/telegraf/telegraf.conf`), uncomment +and edit the `username` and `password` settings: +> + ############################################################################### + # OUTPUT PLUGINS # + ############################################################################### +> + [...] +> + ## Write timeout (for the InfluxDB client), formatted as a string. + ## If not provided, will default to 5s. 0s means no timeout (not recommended). + timeout = "5s" + username = "telegraf" #💥 + password = "metricsmetricsmetricsmetrics" #💥 +> + [...] +> +Next, restart Telegraf and you're all set! + +## Authorization + +Authorization is only enforced once you've [enabled authentication](#set-up-authentication). +By default, authentication is disabled, all credentials are silently ignored, and all users have all privileges. + +### User Types and Privileges + +#### Admin users +Admin users have `READ` and `WRITE` access to all databases and full access to the following administrative queries: + +Database management: +   ◦      `CREATE DATABASE`, and `DROP DATABASE` +   ◦      `DROP SERIES` and `DROP MEASUREMENT` +   ◦      `CREATE RETENTION POLICY`, `ALTER RETENTION POLICY`, and `DROP RETENTION POLICY` +   ◦      `CREATE CONTINUOUS QUERY` and `DROP CONTINUOUS QUERY` + +See the [database management](/influxdb/v1.3/query_language/database_management/) and [continuous queries](/influxdb/v1.3/query_language/continuous_queries/) pages for a complete discussion of the commands listed above. 
+
+User management:
+   ◦      Admin user management:
+           [`CREATE USER`](#user-management-commands), [`GRANT ALL PRIVILEGES`](#grant-administrative-privileges-to-an-existing-user), [`REVOKE ALL PRIVILEGES`](#revoke-administrative-privileges-from-an-admin-user), and [`SHOW USERS`](#show-all-existing-users-and-their-admin-status)
+   ◦      Non-admin user management:
+           [`CREATE USER`](#create-a-new-non-admin-user), [`GRANT [READ,WRITE,ALL]`](#grant-read-write-or-all-database-privileges-to-an-existing-user), [`REVOKE [READ,WRITE,ALL]`](#revoke-read-write-or-all-database-privileges-from-an-existing-user), and [`SHOW GRANTS`](#show-a-user-s-database-privileges)
+   ◦      General user management:
+           [`SET PASSWORD`](#re-set-a-user-s-password) and [`DROP USER`](#drop-a-user)
+
+See [below](#user-management-commands) for a complete discussion of the user management commands.
+
+#### Non-admin users
+Non-admin users can have one of the following three privileges per database:
+   ◦      `READ`
+   ◦      `WRITE`
+   ◦      `ALL` (both `READ` and `WRITE` access)
+
+`READ`, `WRITE`, and `ALL` privileges are controlled per user per database. A new non-admin user has no access to any database until they are specifically [granted privileges to a database](#grant-read-write-or-all-database-privileges-to-an-existing-user) by an admin user.
+Non-admin users can [`SHOW`](/influxdb/v1.3/query_language/schema_exploration/#show-databases) the databases on which they have `READ` and/or `WRITE` permissions.
+
+### User Management Commands
+
+#### Admin user management
+
+When you enable HTTP authentication, InfluxDB requires you to create at least one admin user before you can interact with the system.
+
+`CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES`
+
+##### `CREATE` another admin user:
+```
+CREATE USER <username> WITH PASSWORD '<password>' WITH ALL PRIVILEGES
+```
+
+CLI example:
+
+```bash
+> CREATE USER paul WITH PASSWORD 'timeseries4days' WITH ALL PRIVILEGES
+>
+```
+
+> **Note:** Repeating the exact `CREATE USER` statement is idempotent. If any values change, the database will return a duplicate user error. See GitHub Issue [#6890](https://github.com/influxdata/influxdb/pull/6890) for details.
+>
+CLI example:
+>
+    > CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
+    > CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
+    > CREATE USER todd WITH PASSWORD '123' WITH ALL PRIVILEGES
+    ERR: user already exists
+    > CREATE USER todd WITH PASSWORD '123456'
+    ERR: user already exists
+    > CREATE USER todd WITH PASSWORD '123456' WITH ALL PRIVILEGES
+    >
+
+##### `GRANT` administrative privileges to an existing user:
+
+```
+GRANT ALL PRIVILEGES TO <username>
+```
+
+CLI example:
+
+```bash
+> GRANT ALL PRIVILEGES TO "todd"
+>
+```
+
+##### `REVOKE` administrative privileges from an admin user:
+
+```
+REVOKE ALL PRIVILEGES FROM <username>
+```
+
+CLI example:
+
+```bash
+> REVOKE ALL PRIVILEGES FROM "todd"
+>
+```
+
+##### `SHOW` all existing users and their admin status:
+
+```
+SHOW USERS
+```
+
+CLI example:
+
+```bash
+> SHOW USERS
+user     admin
+todd     false
+paul     true
+hermione false
+dobby    false
+```
+
+#### Non-admin user management
+##### `CREATE` a new non-admin user:
+
+```
+CREATE USER <username> WITH PASSWORD '<password>'
+```
+
+CLI example:
+
+```bash
+> CREATE USER todd WITH PASSWORD 'influxdb41yf3'
+> CREATE USER alice WITH PASSWORD 'wonder\'land'
+> CREATE USER "rachel_smith" WITH PASSWORD 'asdf1234!'
+> CREATE USER "monitoring-robot" WITH PASSWORD 'XXXXX'
+> CREATE USER "$savyadmin" WITH PASSWORD 'm3tr1cL0v3r'
+>
+```
+
+> **Notes:**
+>
+* The user value must be wrapped in double quotes if it starts with a digit, is an InfluxQL keyword, contains a hyphen, or includes any special characters, for example: `!@#$%^&*()-`
+* The password [string](/influxdb/v1.3/query_language/spec/#strings) must be wrapped in single quotes.
+* Do not include the single quotes when authenticating requests.
+For passwords that include a single quote or a newline character, escape the single quote or newline character with a backslash both when creating the password and when submitting authentication requests.
+* Repeating the exact `CREATE USER` statement is idempotent. If any values change, the database will return a duplicate user error. See GitHub Issue [#6890](https://github.com/influxdata/influxdb/pull/6890) for details.
+>
+CLI example:
+>
+    > CREATE USER "todd" WITH PASSWORD '123456'
+    > CREATE USER "todd" WITH PASSWORD '123456'
+    > CREATE USER "todd" WITH PASSWORD '123'
+    ERR: user already exists
+    > CREATE USER "todd" WITH PASSWORD '123456'
+    > CREATE USER "todd" WITH PASSWORD '123456' WITH ALL PRIVILEGES
+    ERR: user already exists
+    > CREATE USER "todd" WITH PASSWORD '123456'
+    >
+
+##### `GRANT` `READ`, `WRITE` or `ALL` database privileges to an existing user:
+
+```
+GRANT [READ,WRITE,ALL] ON <database_name> TO <username>
+```
+
+CLI examples:
+
+`GRANT` `READ` access to `todd` on the `NOAA_water_database` database:
+
+```bash
+> GRANT READ ON "NOAA_water_database" TO "todd"
+>
+```
+
+`GRANT` `ALL` access to `todd` on the `NOAA_water_database` database:
+
+```bash
+> GRANT ALL ON "NOAA_water_database" TO "todd"
+>
+```
+
+##### `REVOKE` `READ`, `WRITE`, or `ALL` database privileges from an existing user:
+
+```
+REVOKE [READ,WRITE,ALL] ON <database_name> FROM <username>
+```
+
+CLI examples:
+
+`REVOKE` `ALL` privileges from `todd` on the `NOAA_water_database` database:
+
+```bash
+> REVOKE ALL ON "NOAA_water_database" FROM "todd"
+>
+```
+
+`REVOKE` `WRITE` privileges from `todd` on the `NOAA_water_database` database:
+
+```bash
+> REVOKE WRITE ON "NOAA_water_database" FROM "todd"
+>
+```
+
+>**Note:** If a user with `ALL` privileges has `WRITE` privileges revoked, they are left with `READ` privileges, and vice versa.
+
+##### `SHOW` a user's database privileges:
+
+```
+SHOW GRANTS FOR <user_name>
+```
+
+CLI example:
+
+```bash
+> SHOW GRANTS FOR "todd"
+database                   privilege
+NOAA_water_database        WRITE
+another_database_name      READ
+yet_another_database_name  ALL PRIVILEGES
+```
+
+#### General admin and non-admin user management
+
+##### Re`SET` a user's password:
+
+```
+SET PASSWORD FOR <username> = '<password>'
+```
+
+CLI example:
+
+```bash
+> SET PASSWORD FOR "todd" = 'influxdb4ever'
+>
+```
+
+> **Note:** The password [string](/influxdb/v1.3/query_language/spec/#strings) must be wrapped in single quotes.
+Do not include the single quotes when authenticating requests.
+>
+For passwords that include a single quote or a newline character, escape the single quote or newline character with a backslash both when creating the password and when submitting authentication requests.
+
+##### `DROP` a user:
+
+```
+DROP USER <username>
+```
+
+CLI example:
+
+```bash
+> DROP USER "todd"
+>
+```
+
+## Authentication and Authorization HTTP Errors
+
+Requests with no authentication credentials or incorrect credentials yield the `HTTP 401 Unauthorized` response.
+
+Requests by unauthorized users yield the `HTTP 403 Forbidden` response.
diff --git a/content/influxdb/v1.3/query_language/continuous_queries.md b/content/influxdb/v1.3/query_language/continuous_queries.md
new file mode 100644
index 000000000..e744d4c6d
--- /dev/null
+++ b/content/influxdb/v1.3/query_language/continuous_queries.md
@@ -0,0 +1,945 @@
+---
+title: Continuous Queries
+
+menu:
+  influxdb_1_3:
+    weight: 40
+    parent: influxql
+---
+
+
+## Introduction
+
+Continuous Queries (CQ) are InfluxQL queries that run automatically and
+periodically on realtime data and store query results in a
+specified measurement.
+
+<table style="width:100%">
+  <tr>
+    <td><a href="#basic-syntax">Basic Syntax</a></td>
+    <td><a href="#advanced-syntax">Advanced Syntax</a></td>
+    <td><a href="#continuous-query-management">CQ Management</a></td>
+  </tr>
+  <tr>
+    <td><a href="#examples-of-basic-syntax">Examples of Basic Syntax</a></td>
+    <td><a href="#examples-of-advanced-syntax">Examples of Advanced Syntax</a></td>
+    <td><a href="#continuous-query-use-cases">CQ Use Cases</a></td>
+  </tr>
+  <tr>
+    <td><a href="#common-issues-with-basic-syntax">Common Issues with Basic Syntax</a></td>
+    <td><a href="#common-issues-with-advanced-syntax">Common Issues with Advanced Syntax</a></td>
+    <td><a href="#further-reading">Further Reading</a></td>
+  </tr>
+</table>
+
+## Syntax
+
+### Basic Syntax
+
+```
+CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
+BEGIN
+  <cq_query>
+END
+```
+
+#### Description of Basic Syntax
+
+##### The cq_query
+
+
+The `cq_query` requires a
+[function](/influxdb/v1.3/concepts/glossary/#function),
+an [`INTO` clause](/influxdb/v1.3/query_language/spec/#clauses),
+and a [`GROUP BY time()` clause](/influxdb/v1.3/query_language/spec/#clauses):
+
+```
+SELECT <function[s]> INTO <destination_measurement> FROM <measurement> [WHERE <stuff>] GROUP BY time(<interval>)[,<tag_key[s]>]
+```
+
+>**Note:** Notice that the `cq_query` does not require a time range in a `WHERE` clause.
+InfluxDB automatically generates a time range for the `cq_query` when it executes the CQ.
+Any user-specified time ranges in the `cq_query`'s `WHERE` clause will be ignored
+by the system.
+
+##### Schedule and Coverage
+
+CQs operate on realtime data. +They use the local server’s timestamp, the `GROUP BY time()` interval, and +InfluxDB's preset time boundaries to determine when to execute and what time +range to cover in the query. + +CQs execute at the same interval as the `cq_query`'s `GROUP BY time()` interval, +and they run at the start of InfluxDB's preset time boundaries. +If the `GROUP BY time()` interval is one hour, the CQ executes at the start of +every hour. + +When the CQ executes, it runs a single query for the time range between +[`now()`](/influxdb/v1.3/concepts/glossary/#now) and `now()` minus the +`GROUP BY time()` interval. +If the `GROUP BY time()` interval is one hour and the current time is 17:00, +the query's time range is between 16:00 and 16:59.999999999. + +#### Examples of Basic Syntax + +The examples below use the following sample data in the `transportation` +database. +The measurement `bus_data` stores 15-minute resolution data on the number of bus +`passengers` and `complaints`: + +``` +name: bus_data +-------------- +time passengers complaints +2016-08-28T07:00:00Z 5 9 +2016-08-28T07:15:00Z 8 9 +2016-08-28T07:30:00Z 8 9 +2016-08-28T07:45:00Z 7 9 +2016-08-28T08:00:00Z 8 9 +2016-08-28T08:15:00Z 15 7 +2016-08-28T08:30:00Z 15 7 +2016-08-28T08:45:00Z 17 7 +2016-08-28T09:00:00Z 20 7 +``` + +##### Example 1: Automatically downsample data +
+
+Use a simple CQ to automatically downsample data from a single field
+and write the results to another measurement in the same database.
+
+```
+CREATE CONTINUOUS QUERY "cq_basic" ON "transportation"
+BEGIN
+  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
+END
+```
+
+`cq_basic` calculates the average hourly number of passengers from the
+`bus_data` measurement and stores the results in the `average_passengers`
+measurement in the `transportation` database.
+
+`cq_basic` executes at one-hour intervals, the same interval as the
+`GROUP BY time()` interval.
+Every hour, `cq_basic` runs a single query that covers the time range between
+`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time
+range between `now()` and one hour prior to `now()`.
+
+Annotated log output on the morning of August 28, 2016:
+
+>
+At **8:00** `cq_basic` executes a query with the time range `time >= '7:00' AND time < '8:00'`.
+`cq_basic` writes one point to the `average_passengers` measurement:
+>
+    name: average_passengers
+    ------------------------
+    time                  mean
+    2016-08-28T07:00:00Z  7
+>
+At **9:00** `cq_basic` executes a query with the time range `time >= '8:00' AND time < '9:00'`.
+`cq_basic` writes one point to the `average_passengers` measurement:
+>
+    name: average_passengers
+    ------------------------
+    time                  mean
+    2016-08-28T08:00:00Z  13.75
+
+Results:
+```
+> SELECT * FROM "average_passengers"
+name: average_passengers
+------------------------
+time                  mean
+2016-08-28T07:00:00Z  7
+2016-08-28T08:00:00Z  13.75
+```
+
+##### Example 2: Automatically downsample data into another retention policy
+
+[Fully qualify](/influxdb/v1.3/query_language/data_exploration/#the-basic-select-statement) +the destination measurement to store the downsampled data in a non-`DEFAULT` +[retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) (RP). + +``` +CREATE CONTINUOUS QUERY "cq_basic_rp" ON "transportation" +BEGIN + SELECT mean("passengers") INTO "transportation"."three_weeks"."average_passengers" FROM "bus_data" GROUP BY time(1h) +END +``` + +`cq_basic_rp` calculates the average hourly number of passengers from the +`bus_data` measurement and stores the results in the `transportation` database, +the `three_weeks` RP, and the `average_passengers` measurement. + +`cq_basic_rp` executes at one-hour intervals, the same interval as the +`GROUP BY time()` interval. +Every hour, `cq_basic_rp` runs a single query that covers the time range between +`now()` and `now()` minus the `GROUP BY time()` interval, that is, the time +range between `now()` and one hour prior to `now()`. + +Annotated log output on the morning of August 28, 2016: + +> +At **8:00** `cq_basic_rp` executes a query with the time range `time >= '7:00' AND time < '8:00'`. +`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T07:00:00Z 7 +> +At **9:00** `cq_basic_rp` executes a query with the time range +`time >= '8:00' AND time < '9:00'`. 
+`cq_basic_rp` writes one point to the `three_weeks` RP and the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T08:00:00Z 13.75 + +Results: +``` +> SELECT * FROM "transportation"."three_weeks"."average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T07:00:00Z 7 +2016-08-28T08:00:00Z 13.75 +``` + +`cq_basic_rp` uses CQs and retention policies to automatically downsample data +and keep those downsampled data for an alternative length of time. +See the [Downsampling and Data Retention](/influxdb/v1.3/guides/downsampling_and_retention/) +guide for an in-depth discussion about this CQ use case. + +##### Example 3: Automatically downsample a database with backreferencing +
+
+Use a function with a wildcard (`*`) and the `INTO` query's
+[backreferencing syntax](/influxdb/v1.3/query_language/data_exploration/#the-into-clause)
+to automatically downsample data from all measurements and numerical fields in
+a database.
+
+```
+CREATE CONTINUOUS QUERY "cq_basic_br" ON "transportation"
+BEGIN
+  SELECT mean(*) INTO "downsampled_transportation"."autogen".:MEASUREMENT FROM /.*/ GROUP BY time(30m),*
+END
+```
+
+`cq_basic_br` calculates the 30-minute average of `passengers` and `complaints`
+from every measurement in the `transportation` database (in this case, there's only the
+`bus_data` measurement).
+It stores the results in the `downsampled_transportation` database.
+
+`cq_basic_br` executes at 30-minute intervals, the same interval as the
+`GROUP BY time()` interval.
+Every 30 minutes, `cq_basic_br` runs a single query that covers the time range
+between `now()` and `now()` minus the `GROUP BY time()` interval, that is,
+the time range between `now()` and 30 minutes prior to `now()`.
+
+Annotated log output on the morning of August 28, 2016:
+
+>
+At **7:30**, `cq_basic_br` executes a query with the time range `time >= '7:00' AND time < '7:30'`.
+`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
+>
+    name: bus_data
+    --------------
+    time                  mean_complaints  mean_passengers
+    2016-08-28T07:00:00Z  9                6.5
+>
+At **8:00**, `cq_basic_br` executes a query with the time range `time >= '7:30' AND time < '8:00'`.
+`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
+>
+    name: bus_data
+    --------------
+    time                  mean_complaints  mean_passengers
+    2016-08-28T07:30:00Z  9                7.5
+>
+[...]
+>
+At **9:00**, `cq_basic_br` executes a query with the time range `time >= '8:30' AND time < '9:00'`.
+`cq_basic_br` writes two points to the `bus_data` measurement in the `downsampled_transportation` database:
+>
+    name: bus_data
+    --------------
+    time                  mean_complaints  mean_passengers
+    2016-08-28T08:30:00Z  7                16
+
+
+Results:
+
+```
+> SELECT * FROM "downsampled_transportation"."autogen"."bus_data"
+name: bus_data
+--------------
+time                  mean_complaints  mean_passengers
+2016-08-28T07:00:00Z  9                6.5
+2016-08-28T07:30:00Z  9                7.5
+2016-08-28T08:00:00Z  8                11.5
+2016-08-28T08:30:00Z  7                16
+```
+
+##### Example 4: Automatically downsample data and configure the CQ time boundaries
+
+
+Use an
+[offset interval](/influxdb/v1.3/query_language/data_exploration/#advanced-group-by-time-syntax)
+in the `GROUP BY time()` clause to alter both the CQ's default execution time and
+preset time boundaries.
+
+```
+CREATE CONTINUOUS QUERY "cq_basic_offset" ON "transportation"
+BEGIN
+  SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h,15m)
+END
+```
+
+`cq_basic_offset` calculates the average hourly number of passengers from the
+`bus_data` measurement and stores the results in the `average_passengers`
+measurement.
+
+`cq_basic_offset` executes at one-hour intervals, the same interval as the
+`GROUP BY time()` interval.
+The 15 minute offset interval forces the CQ to execute 15 minutes after the
+default execution time; `cq_basic_offset` executes at 8:15 instead of 8:00.
+
+Every hour, `cq_basic_offset` runs a single query that covers the time range
+between `now()` and `now()` minus the `GROUP BY time()` interval, that is, the
+time range between `now()` and one hour prior to `now()`.
+The 15 minute offset interval shifts forward the generated preset time boundaries in the
+CQ's `WHERE` clause; `cq_basic_offset` queries between 7:15 and 8:14.999999999 instead of 7:00 and 7:59.999999999.
+
+Annotated log output on the morning of August 28, 2016:
+
+>
+At **8:15** `cq_basic_offset` executes a query with the time range `time >= '7:15' AND time < '8:15'`.
+`cq_basic_offset` writes one point to the `average_passengers` measurement:
+>
+    name: average_passengers
+    ------------------------
+    time                  mean
+    2016-08-28T07:15:00Z  7.75
+>
+At **9:15** `cq_basic_offset` executes a query with the time range `time >= '8:15' AND time < '9:15'`.
+`cq_basic_offset` writes one point to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T08:15:00Z 16.75 + +Results: +``` +> SELECT * FROM "average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T07:15:00Z 7.75 +2016-08-28T08:15:00Z 16.75 +``` +Notice that the timestamps are for 7:15 and 8:15 instead of 7:00 and 8:00. + +#### Common Issues with Basic Syntax + +##### Issue 1: Handling time intervals with no data +
+CQs do not write any results for a time interval if no data fall within that +time range. + +Note that the basic syntax does not support using +[`fill()`](/influxdb/v1.3/query_language/data_exploration/#group-by-time-intervals-and-fill) +to change the value reported for intervals with no data. +Basic syntax CQs ignore `fill()` if it's included in the CQ query. +A possible workaround is to use the +[advanced CQ syntax](#example-4-configure-the-cq-s-time-range-and-fill-empty-results). + +##### Issue 2: Resampling previous time intervals +
+The basic CQ runs a single query that covers the time range between `now()` +and `now()` minus the `GROUP BY time()` interval. +See the [advanced syntax](#advanced-syntax) for how to configure the query's +time range. + +##### Issue 3: Backfilling results for older data +
+CQs operate on realtime data, that is, data with timestamps that occur +relative to [`now()`](/influxdb/v1.3/concepts/glossary/#now). +Use a basic +[`INTO` query](/influxdb/v1.3/query_language/data_exploration/#the-into-clause) +to backfill results for data with older timestamps. + +##### Issue 4: Missing tags in the CQ results +
+
+By default, all
+[`INTO` queries](/influxdb/v1.3/query_language/data_exploration/#the-into-clause)
+convert any tags in the source measurement to fields in the destination
+measurement.
+
+Include `GROUP BY *` in the CQ to preserve tags in the destination measurement.
+
+### Advanced Syntax
+
+```
+CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
+RESAMPLE EVERY <interval> FOR <interval>
+BEGIN
+  <cq_query>
+END
+```
+
+#### Description of Advanced Syntax
+
+##### The cq_query
+
+
+See [Description of Basic Syntax](/influxdb/v1.3/query_language/continuous_queries/#description-of-basic-syntax).
+
+##### Schedule and Coverage
+
+
+CQs operate on realtime data. With the advanced syntax, CQs use the local
+server’s timestamp, the information in the `RESAMPLE` clause, and InfluxDB's
+preset time boundaries to determine when to execute and what time range to
+cover in the query.
+
+CQs execute at the same interval as the `EVERY` interval in the `RESAMPLE`
+clause, and they run at the start of InfluxDB’s preset time boundaries.
+If the `EVERY` interval is two hours, InfluxDB executes the CQ at the top of
+every other hour.
+
+When the CQ executes, it runs a single query for the time range between
+[`now()`](/influxdb/v1.3/concepts/glossary/#now) and `now()` minus the `FOR` interval in the `RESAMPLE` clause.
+If the `FOR` interval is two hours and the current time is 17:00, the query's
+time range is between 15:00 and 16:59.999999999.
+
+Both the `EVERY` interval and the `FOR` interval accept
+[duration literals](/influxdb/v1.3/query_language/spec/#durations).
+The `RESAMPLE` clause works with either or both of the `EVERY` and `FOR` intervals
+configured.
+CQs default to the relevant
+[basic syntax behavior](/influxdb/v1.3/query_language/continuous_queries/#description-of-basic-syntax)
+if the `EVERY` interval or `FOR` interval is not provided (see the first issue in
+[Common Issues with Advanced Syntax](/influxdb/v1.3/query_language/continuous_queries/#common-issues-with-advanced-syntax)
+for an anomalous case).
+
+#### Examples of Advanced Syntax
+
+The examples below use the following sample data in the `transportation` database.
+The measurement `bus_data` stores 15-minute resolution data on the number of bus +`passengers`: +``` +name: bus_data +-------------- +time passengers +2016-08-28T06:30:00Z 2 +2016-08-28T06:45:00Z 4 +2016-08-28T07:00:00Z 5 +2016-08-28T07:15:00Z 8 +2016-08-28T07:30:00Z 8 +2016-08-28T07:45:00Z 7 +2016-08-28T08:00:00Z 8 +2016-08-28T08:15:00Z 15 +2016-08-28T08:30:00Z 15 +2016-08-28T08:45:00Z 17 +2016-08-28T09:00:00Z 20 +``` + +##### Example 1: Configure the execution interval +
+Use an `EVERY` interval in the `RESAMPLE` clause to specify the CQ's execution +interval. + +``` +CREATE CONTINUOUS QUERY "cq_advanced_every" ON "transportation" +RESAMPLE EVERY 30m +BEGIN + SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) +END +``` + +`cq_advanced_every` calculates the one-hour average of `passengers` +from the `bus_data` measurement and stores the results in the +`average_passengers` measurement in the `transportation` database. + +`cq_advanced_every` executes at 30-minute intervals, the same interval as the +`EVERY` interval. +Every 30 minutes, `cq_advanced_every` runs a single query that covers the time +range for the current time bucket, that is, the one-hour time bucket that +intersects with `now()`. + +Annotated log output on the morning of August 28, 2016: + +> +At **8:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`. +`cq_advanced_every` writes one point to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T07:00:00Z 7 +> +At **8:30**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`. +`cq_advanced_every` writes one point to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T08:00:00Z 12.6667 +> +At **9:00**, `cq_advanced_every` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`. +`cq_advanced_every` writes one point to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T08:00:00Z 13.75 + + +Results: +``` +> SELECT * FROM "average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T07:00:00Z 7 +2016-08-28T08:00:00Z 13.75 +``` + +Notice that `cq_advanced_every` calculates the result for the 8:00 time interval +twice. 
+First, it runs at 8:30 and calculates the average for every available data point +between 8:00 and 9:00 (`8`,`15`, and `15`). +Second, it runs at 9:00 and calculates the average for every available data +point between 8:00 and 9:00 (`8`, `15`, `15`, and `17`). +Because of the way InfluxDB +[handles duplicate points](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points) +, the second result simply overwrites the first result. + +##### Example 2: Configure the CQ's time range for resampling +
+Use a `FOR` interval in the `RESAMPLE` clause to specify the length of the CQ's +time range. + +``` +CREATE CONTINUOUS QUERY "cq_advanced_for" ON "transportation" +RESAMPLE FOR 1h +BEGIN + SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m) +END +``` + +`cq_advanced_for` calculates the 30-minute average of `passengers` +from the `bus_data` measurement and stores the results in the `average_passengers` +measurement in the `transportation` database. + +`cq_advanced_for` executes at 30-minute intervals, the same interval as the +`GROUP BY time()` interval. +Every 30 minutes, `cq_advanced_for` runs a single query that covers the time +range between `now()` and `now()` minus the `FOR` interval, that is, the time +range between `now()` and one hour prior to `now()`. + +Annotated log output on the morning of August 28, 2016: + +> +At **8:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:00' AND time < '8:00'`. +`cq_advanced_for` writes two points to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T07:00:00Z 6.5 + 2016-08-28T07:30:00Z 7.5 +> +At **8:30** `cq_advanced_for` executes a query with the time range `WHERE time >= '7:30' AND time < '8:30'`. +`cq_advanced_for` writes two points to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T07:30:00Z 7.5 + 2016-08-28T08:00:00Z 11.5 +> +At **9:00** `cq_advanced_for` executes a query with the time range `WHERE time >= '8:00' AND time < '9:00'`. +`cq_advanced_for` writes two points to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T08:00:00Z 11.5 + 2016-08-28T08:30:00Z 16 + + +Notice that `cq_advanced_for` will calculate the result for every time interval +twice. 
+The CQ calculates the average for the 7:30 time interval at 8:00 and at 8:30, +and it calculates the average for the 8:00 time interval at 8:30 and 9:00. + +Results: +``` +> SELECT * FROM "average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T07:00:00Z 6.5 +2016-08-28T07:30:00Z 7.5 +2016-08-28T08:00:00Z 11.5 +2016-08-28T08:30:00Z 16 +``` + +##### Example 3: Configure the execution interval and the CQ's time range +
+Use an `EVERY` interval and `FOR` interval in the `RESAMPLE` clause to specify +the CQ's execution interval and the length of the CQ's time range. + +``` +CREATE CONTINUOUS QUERY "cq_advanced_every_for" ON "transportation" +RESAMPLE EVERY 1h FOR 90m +BEGIN + SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m) +END +``` + +`cq_advanced_every_for` calculates the 30-minute average of +`passengers` from the `bus_data` measurement and stores the results in the +`average_passengers` measurement in the `transportation` database. + +`cq_advanced_every_for` executes at one-hour intervals, the same interval as the +`EVERY` interval. +Every hour, `cq_advanced_every_for` runs a single query that covers the time +range between `now()` and `now()` minus the `FOR` interval, that is, the time +range between `now()` and 90 minutes prior to `now()`. + +Annotated log output on the morning of August 28, 2016: + +> +At **8:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '6:30' AND time < '8:00'`. +`cq_advanced_every_for` writes three points to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T06:30:00Z 3 + 2016-08-28T07:00:00Z 6.5 + 2016-08-28T07:30:00Z 7.5 +> +At **9:00** `cq_advanced_every_for` executes a query with the time range `WHERE time >= '7:30' AND time < '9:00'`. +`cq_advanced_every_for` writes three points to the `average_passengers` measurement: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T07:30:00Z 7.5 + 2016-08-28T08:00:00Z 11.5 + 2016-08-28T08:30:00Z 16 + +Notice that `cq_advanced_every_for` will calculate the result for every time +interval twice. +The CQ calculates the average for the 7:30 interval at 8:00 and 9:00. 
+ +Results: +``` +> SELECT * FROM "average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T06:30:00Z 3 +2016-08-28T07:00:00Z 6.5 +2016-08-28T07:30:00Z 7.5 +2016-08-28T08:00:00Z 11.5 +2016-08-28T08:30:00Z 16 +``` + +##### Example 4: Configure the CQ's time range and fill empty results +
+Use a `FOR` interval and `fill()` to change the value reported for time +intervals with no data. +Note that at least one data point must fall within the `FOR` interval for `fill()` +to operate. +If no data fall within the `FOR` interval the CQ writes no points to the +destination measurement. + +``` +CREATE CONTINUOUS QUERY "cq_advanced_for_fill" ON "transportation" +RESAMPLE FOR 2h +BEGIN + SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) fill(1000) +END +``` + +`cq_advanced_for_fill` calculates the one-hour average of `passengers` from the +`bus_data` measurement and stores the results in the `average_passengers` +measurement in the `transportation` database. +Where possible, it writes the value `1000` for time intervals with no results. + +`cq_advanced_for_fill` executes at one-hour intervals, the same interval as the +`GROUP BY time()` interval. +Every hour, `cq_advanced_for_fill` runs a single query that covers the time +range between `now()` and `now()` minus the `FOR` interval, that is, the time +range between `now()` and two hours prior to `now()`. + +Annotated log output on the morning of August 28, 2016: + +> +At **6:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '4:00' AND time < '6:00'`. +`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data +that fall within that time range. +> +At **7:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '5:00' AND time < '7:00'`. +`cq_advanced_for_fill` writes two points to `average_passengers`: +> + name: average_passengers + ------------------------ + time mean + 2016-08-28T05:00:00Z 1000 <------ fill(1000) + 2016-08-28T06:00:00Z 3 <------ average of 2 and 4 +> +[...] +> +At **11:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '9:00' AND time < '11:00'`. 
+`cq_advanced_for_fill` writes two points to `average_passengers`: +> + name: average_passengers + ------------------------ + 2016-08-28T09:00:00Z 20 <------ average of 20 + 2016-08-28T10:00:00Z 1000 <------ fill(1000) +> +At **12:00**, `cq_advanced_for_fill` executes a query with the time range `WHERE time >= '10:00' AND time < '12:00'`. +`cq_advanced_for_fill` writes nothing to `average_passengers`; `bus_data` has no data +that fall within that time range. + +Results: +``` +> SELECT * FROM "average_passengers" +name: average_passengers +------------------------ +time mean +2016-08-28T05:00:00Z 1000 +2016-08-28T06:00:00Z 3 +2016-08-28T07:00:00Z 7 +2016-08-28T08:00:00Z 13.75 +2016-08-28T09:00:00Z 20 +2016-08-28T10:00:00Z 1000 +``` + +> **Note:** `fill(previous)` doesn’t fill the result for a time interval if the +previous value is outside the query’s time range. +See [Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-does-fill-previous-return-empty-results) +for more information. + +#### Common Issues with Advanced Syntax + +##### Issue 1: If the `EVERY` interval is greater than the `GROUP BY time()` interval +
+If the `EVERY` interval is greater than the `GROUP BY time()` interval, the CQ +executes at the same interval as the `EVERY` interval and runs a single query +that covers the time range between `now()` and `now()` minus the `EVERY` +interval (not between `now()` and `now()` minus the `GROUP BY time()` interval). + +For example, if the `GROUP BY time()` interval is `5m` and the `EVERY` interval +is `10m`, the CQ executes every ten minutes. +Every ten minutes, the CQ runs a single query that covers the time range +between `now()` and `now()` minus the `EVERY` interval, that is, the time +range between `now()` and ten minutes prior to `now()`. + +This behavior is intentional and prevents the CQ from missing data between +execution times. + +##### Issue 2: If the `FOR` interval is less than the execution interval +
+
+If the `FOR` interval is less than the `GROUP BY time()` interval or, if
+specified, the `EVERY` interval, InfluxDB returns the following error:
+
+```
+error parsing query: FOR duration must be >= GROUP BY time duration: must be a minimum of <minimum_interval> got <provided_interval>
+```
+
+To avoid missing data between execution times, the `FOR` interval must be equal
+to or greater than the `GROUP BY time()` interval or, if specified, the `EVERY`
+interval.
+
+Currently, this is the intended behavior.
+GitHub Issue [#6963](https://github.com/influxdata/influxdb/issues/6963)
+outlines a feature request for CQs to support gaps in data coverage.
+
+## Continuous Query Management
+
+Only admin users are allowed to work with CQs. For more on user privileges, see [Authentication and Authorization](/influxdb/v1.3/query_language/authentication_and_authorization/#user-types-and-privileges).
+
+### List CQs
+
+List every CQ on an InfluxDB instance with:
+
+```
+SHOW CONTINUOUS QUERIES
+```
+
+`SHOW CONTINUOUS QUERIES` groups results by database.
+
+##### Example
+
+The output shows that the `telegraf` and `mydb` databases have CQs:
+```
+> SHOW CONTINUOUS QUERIES
+name: _internal
+---------------
+name query
+
+
+name: telegraf
+--------------
+name query
+idle_hands CREATE CONTINUOUS QUERY idle_hands ON telegraf BEGIN SELECT min(usage_idle) INTO telegraf.autogen.min_hourly_cpu FROM telegraf.autogen.cpu GROUP BY time(1h) END
+feeling_used CREATE CONTINUOUS QUERY feeling_used ON telegraf BEGIN SELECT mean(used) INTO downsampled_telegraf.autogen.:MEASUREMENT FROM telegraf.autogen./.*/ GROUP BY time(1h) END
+
+
+name: downsampled_telegraf
+--------------------------
+name query
+
+
+name: mydb
+----------
+name query
+vampire CREATE CONTINUOUS QUERY vampire ON mydb BEGIN SELECT count(dracula) INTO mydb.autogen.all_of_them FROM mydb.autogen.one GROUP BY time(5m) END
+```
+
+### Delete CQs
+
+Delete a CQ from a specific database with:
+```
+DROP CONTINUOUS QUERY <cq_name> ON <database_name>
+```
+`DROP CONTINUOUS QUERY` returns an empty result.
+##### Example
+
+Drop the `idle_hands` CQ from the `telegraf` database:
+```
+> DROP CONTINUOUS QUERY "idle_hands" ON "telegraf"
+>
+```
+
+### Alter CQs
+
+CQs cannot be altered once they're created.
+To change a CQ, you must `DROP` and re`CREATE` it with the updated settings.
+
+## Continuous Query Use Cases
+
+### Downsampling and Data Retention
+
+Use CQs with InfluxDB's
+[retention policies](/influxdb/v1.3/concepts/glossary/#retention-policy-rp)
+(RPs) to mitigate storage concerns.
+Combine CQs and RPs to automatically downsample high precision data to a lower
+precision and remove the dispensable, high precision data from the database.
+
+See the
+[Downsampling and Data Retention](/influxdb/v1.3/guides/downsampling_and_retention/)
+guide for a detailed walkthrough of this common use case.
+
+### Pre-calculating Expensive Queries
+
+Shorten query runtimes by pre-calculating expensive queries with CQs.
+Use a CQ to automatically downsample commonly-queried, high precision data to a
+lower precision.
+Queries on lower precision data require fewer resources and return faster.
+
+**Tip:** Pre-calculate queries for your preferred graphing tool to accelerate
+the population of graphs and dashboards.
+
+### Substituting for a `HAVING` Clause
+
+InfluxQL does not support [`HAVING` clauses](https://en.wikipedia.org/wiki/Having_(SQL\)).
+Get the same functionality by creating a CQ to aggregate the data and querying
+the CQ results to apply the `HAVING` clause.
+
+> **Note:** InfluxQL supports [subqueries](/influxdb/v1.3/query_language/data_exploration/#subqueries) which also offer similar functionality to `HAVING` clauses.
+See [Data Exploration](/influxdb/v1.3/query_language/data_exploration/#subqueries) for more information.
+
+##### Example
+
+InfluxDB does not accept the following query with a `HAVING` clause. +The query calculates the average number of `bees` at `30` minute intervals and +requests averages that are greater than `20`. +``` +SELECT mean("bees") FROM "farm" GROUP BY time(30m) HAVING mean("bees") > 20 +``` + +To get the same results: + +**1. Create a CQ** +
+This step performs the `mean("bees")` part of the query above.
+Because this step creates a CQ, you only need to execute it once.
+
+The following CQ automatically calculates the average number of `bees` at
+`30` minute intervals and writes those averages to the `mean_bees` field in the
+`aggregate_bees` measurement.
+
+```
+CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT mean("bees") AS "mean_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END
+```
+
+**2. Query the CQ results**
+
+This step performs the `HAVING mean("bees") > 20` part of the query above. + +Query the data in the measurement `aggregate_bees` and request values of the `mean_bees` field that are greater than `20` in the `WHERE` clause: + +``` +SELECT "mean_bees" FROM "aggregate_bees" WHERE "mean_bees" > 20 +``` + +### Substituting for Nested Functions + +Some InfluxQL functions +[support nesting](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#which-influxql-functions-support-nesting) +of other functions. +Most do not. +If your function does not support nesting, you can get the same functionality using a CQ to calculate +the inner-most function. +Then simply query the CQ results to calculate the outer-most function. + +> **Note:** InfluxQL supports [subqueries](/influxdb/v1.3/query_language/data_exploration/#subqueries) which also offer the same functionality as nested functions. +See [Data Exploration](/influxdb/v1.3/query_language/data_exploration/#subqueries) for more information. + +##### Example +
+InfluxDB does not accept the following query with a nested function. +The query calculates the number of non-null values +of `bees` at `30` minute intervals and the average of those counts: +``` +SELECT mean(count("bees")) FROM "farm" GROUP BY time(30m) +``` + +To get the same results: + +**1. Create a CQ** +
+This step performs the `count("bees")` part of the nested function above. +Because this step creates a CQ you only need to execute it once. + +The following CQ automatically calculates the number of non-null values of `bees` at `30` minute intervals +and writes those counts to the `count_bees` field in the `aggregate_bees` measurement. +``` +CREATE CONTINUOUS QUERY "bee_cq" ON "mydb" BEGIN SELECT count("bees") AS "count_bees" INTO "aggregate_bees" FROM "farm" GROUP BY time(30m) END +``` + +**2. Query the CQ results** +
+This step performs the `mean([...])` part of the nested function above.
+
+Query the data in the measurement `aggregate_bees` to calculate the average of the
+`count_bees` field:
+```
+SELECT mean("count_bees") FROM "aggregate_bees" WHERE time >= <start_time> AND time <= <end_time>
+```
+
+## Further Reading
+
+We recommend visiting the
+[Downsampling and Data Retention](/influxdb/v1.3/guides/downsampling_and_retention/)
+guide to see how to combine two InfluxDB features, CQs and retention policies,
+to periodically downsample data and automatically expire the dispensable high
+precision data.
+
+Kapacitor, InfluxData's data processing engine, can do the same work as
+InfluxDB's CQs.
+Check out the
+[Kapacitor documentation](/kapacitor/v1.3/examples/continuous_queries/) for when
+to use Kapacitor instead of InfluxDB and how to perform the same CQ
+functionality with a TICKscript.
diff --git a/content/influxdb/v1.3/query_language/data_download.md b/content/influxdb/v1.3/query_language/data_download.md
new file mode 100644
index 000000000..cd9526509
--- /dev/null
+++ b/content/influxdb/v1.3/query_language/data_download.md
@@ -0,0 +1,121 @@
+---
+title: Sample Data
+menu:
+  influxdb_1_3:
+    weight: 5
+    parent: influxql
+aliases:
+  - /influxdb/v1.3/sample_data/data_download/
+---
+
+To explore the query language further, use these instructions to create a database
+and to download and write sample data to that database in your InfluxDB installation.
+The sample data is then used and referenced in [Data Exploration](../../query_language/data_exploration/),
+[Schema Exploration](../../query_language/schema_exploration/), and [Functions](../../query_language/functions/).
+
+## Creating a database
+
+If you've installed InfluxDB locally, the `influx` command should be available via the command line.
+Executing `influx` will start the CLI and automatically connect to the local InfluxDB instance
+(assuming you have already started the server with `service influxdb start` or by running `influxd` directly).
+The output should look like this:
+
+```bash
+$ influx -precision rfc3339
+Connected to http://localhost:8086 version 1.3.x
+InfluxDB shell 1.3.x
+>
+```
+
+> **Notes:**
+>
+* The InfluxDB HTTP API runs on port `8086` by default.
+Therefore, `influx` will connect to port `8086` and `localhost` by default.
+If you need to alter these defaults, run `influx --help`.
+* The [`-precision` argument](/influxdb/latest/tools/shell/#influx-arguments) specifies the format/precision of any returned timestamps.
+In the example above, `rfc3339` tells InfluxDB to return timestamps in [RFC3339 format](https://www.ietf.org/rfc/rfc3339.txt) (`YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ`).
+
+The command line is now ready to take input in the form of Influx Query Language (a.k.a. InfluxQL) statements.
+To exit the InfluxQL shell, type `exit` and hit return.
+
+A fresh install of InfluxDB has no databases (apart from the system `_internal`),
+so creating one is our first task.
+You can create a database with the `CREATE DATABASE <db_name>` InfluxQL statement,
+where `<db_name>` is the name of the database you wish to create.
+Database names can contain any Unicode character as long as the string is double-quoted.
+Names can also be left unquoted if they contain _only_ ASCII letters,
+digits, or underscores and do not begin with a digit.
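The unquoted-name rule above is easy to mis-remember when generating statements from code. Here is a minimal sketch of it in Python — a hypothetical helper for illustration, not part of InfluxDB, and it omits the InfluxQL-keyword check:

```python
import re

def needs_double_quotes(name: str) -> bool:
    # Unquoted identifiers may contain only ASCII letters, digits, and
    # underscores, and must not begin with a digit.
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name) is None

def create_database_statement(name: str) -> str:
    # Double-quote the name only when the unquoted form is not allowed.
    if needs_double_quotes(name):
        return 'CREATE DATABASE "{}"'.format(name)
    return "CREATE DATABASE {}".format(name)

print(create_database_statement("NOAA_water_database"))  # no quotes needed
print(create_database_statement("my-db"))                # hyphen forces quoting
```

Any name containing a hyphen, a space, or a leading digit comes back double-quoted.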
+ +Throughout the query language exploration, we'll use the database name `NOAA_water_database`: + +``` +> CREATE DATABASE NOAA_water_database +> exit +``` + +### Download and write the data to InfluxDB + +From your terminal, download the text file that contains the data in [line protocol](/influxdb/v1.3/concepts/glossary/#line-protocol) format: +``` +curl https://s3.amazonaws.com/noaa.water-database/NOAA_data.txt -o NOAA_data.txt +``` + +Write the data to InfluxDB via the [CLI](../../tools/shell/): +``` +influx -import -path=NOAA_data.txt -precision=s -database=NOAA_water_database +``` + +### Test queries +```bash +$ influx -precision rfc3339 -database NOAA_water_database +Connected to http://localhost:8086 version 1.3.x +InfluxDB shell 1.3.x +> +``` + +See all five measurements: +```bash +> SHOW measurements +name: measurements +------------------ +name +average_temperature +h2o_feet +h2o_pH +h2o_quality +h2o_temperature +``` + +Count the number of non-null values of `water_level` in `h2o_feet`: +```bash +> SELECT COUNT("water_level") FROM h2o_feet +name: h2o_feet +-------------- +time count +1970-01-01T00:00:00Z 15258 +``` + +Select the first five observations in the measurement h2o_feet: + +```bash +> SELECT * FROM h2o_feet LIMIT 5 +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +2015-08-18T00:06:00Z between 6 and 9 feet coyote_creek 8.005 +2015-08-18T00:06:00Z below 3 feet santa_monica 2.116 +2015-08-18T00:12:00Z between 6 and 9 feet coyote_creek 7.887 +``` + +### Data sources and things to note +The sample data is publicly available data from the [National Oceanic and Atmospheric Administration’s (NOAA) Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html?type=Water+Levels). 
+The data include 15,258 observations of water levels (ft) collected every six seconds at two stations (Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period from August 18, 2015 through September 18, 2015. + +Note that the measurements `average_temperature`, `h2o_pH`, `h2o_quality`, and `h2o_temperature` contain fictional data. +Those measurements serve to illuminate query functionality in [Schema Exploration](../../query_language/schema_exploration/). + + +The `h2o_feet` measurement is the only measurement that contains the NOAA data. +Please note that the `level description` field isn't part of the original NOAA data - we snuck it in there for the sake of having a field key with a special character and string [field values](../../concepts/glossary/#field-value). diff --git a/content/influxdb/v1.3/query_language/data_exploration.md b/content/influxdb/v1.3/query_language/data_exploration.md new file mode 100644 index 000000000..09c66c1de --- /dev/null +++ b/content/influxdb/v1.3/query_language/data_exploration.md @@ -0,0 +1,3331 @@ +--- +title: Data Exploration + +menu: + influxdb_1_3: + weight: 10 + parent: influxql +--- + +InfluxQL is an SQL-like query language for interacting with data in InfluxDB. +The following sections detail InfluxQL's `SELECT` statement and useful query syntax +for exploring your data. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
The Basics:Configure Query Results:General Tips on Query Syntax:
The SELECT StatementORDER BY time DESCTime Syntax
The WHERE ClauseThe LIMIT and SLIMIT ClausesRegular Expressions
The GROUP BY ClauseThe OFFSET and SOFFSET ClausesData Types and Cast Operations
The INTO ClauseThe Time Zone ClauseMerge Behavior
Multiple Statements
Subqueries
+</table>
+
+### Sample Data
+
+This document uses publicly available data from the
+[National Oceanic and Atmospheric Administration's (NOAA) Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html?type=Water+Levels).
+See the [Sample Data](/influxdb/v1.3/query_language/data_download/) page to download
+the data and follow along with the example queries in the sections below.
+
+Start by logging into the Influx CLI:
+```bash
+$ influx -precision rfc3339 -database NOAA_water_database
+Connected to http://localhost:8086 version 1.3.x
+InfluxDB shell 1.3.x
+>
+```
+
+Next, get acquainted with this subsample of the data in the `h2o_feet` measurement:
+
+```
+name: h2o_feet
+--------------
+time                   level description      location       water_level
+2015-08-18T00:00:00Z   between 6 and 9 feet   coyote_creek   8.12
+2015-08-18T00:00:00Z   below 3 feet           santa_monica   2.064
+2015-08-18T00:06:00Z   between 6 and 9 feet   coyote_creek   8.005
+2015-08-18T00:06:00Z   below 3 feet           santa_monica   2.116
+2015-08-18T00:12:00Z   between 6 and 9 feet   coyote_creek   7.887
+2015-08-18T00:12:00Z   below 3 feet           santa_monica   2.028
+```
+
+The data in the `h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement)
+occur at six-minute time intervals.
+The measurement has one [tag key](/influxdb/v1.3/concepts/glossary/#tag-key)
+(`location`) which has two [tag values](/influxdb/v1.3/concepts/glossary/#tag-value):
+`coyote_creek` and `santa_monica`.
+The measurement also has two [fields](/influxdb/v1.3/concepts/glossary/#field):
+`level description` stores string [field values](/influxdb/v1.3/concepts/glossary/#field-value)
+and `water_level` stores float field values.
+All of this data is in the `NOAA_water_database` [database](/influxdb/v1.3/concepts/glossary/#database).
+ +> **Disclaimer:** The `level description` field isn't part of the original NOAA data - we snuck it in there for the sake of having a field key with a special character and string field values. + +
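For reference, each row of that subsample corresponds to one point written in [line protocol](/influxdb/v1.3/concepts/glossary/#line-protocol). The sketch below builds the line protocol for a single sample point; it is an illustration under simplified assumptions (it only escapes commas, equals signs, and spaces in keys, which is all this dataset needs), not a full line protocol encoder:

```python
from datetime import datetime, timezone

def escape_key(s: str) -> str:
    # Line protocol escapes commas, equals signs, and spaces in keys.
    return s.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")

def to_line_protocol(measurement, tags, fields, ts):
    tag_part = ",".join(
        f"{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        # String field values are double-quoted; numeric values are bare.
        f'{escape_key(k)}="{v}"' if isinstance(v, str) else f"{escape_key(k)}={v}"
        for k, v in sorted(fields.items())
    )
    ns = int(ts.timestamp()) * 1_000_000_000  # nanosecond-precision timestamp
    return f"{escape_key(measurement)},{tag_part} {field_part} {ns}"

point = to_line_protocol(
    "h2o_feet",
    {"location": "santa_monica"},
    {"level description": "below 3 feet", "water_level": 2.064},
    datetime(2015, 8, 18, tzinfo=timezone.utc),
)
print(point)
```

The `level description` key shows why escaping matters: its embedded space must be written as `level\ description`.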
+# The basic SELECT statement + +The `SELECT` statement queries data from a particular [measurement](/influxdb/v1.3/concepts/glossary/#measurement) or measurements. + +Tired of reading? Check out this InfluxQL Short: +
+
+
+
+### Syntax
+```sql
+SELECT <field_key>[,<field_key>,<tag_key>] FROM <measurement_name>[,<measurement_name>]
+```
+
+### Description of Syntax
+
+The `SELECT` statement requires a `SELECT` clause and a `FROM` clause.
+
+#### `SELECT` clause
+The `SELECT` clause supports several formats for specifying data:
+
+`SELECT *`
+Returns all [fields](/influxdb/v1.3/concepts/glossary/#field) and [tags](/influxdb/v1.3/concepts/glossary/#tag).
+
+`SELECT "<field_key>"`
+Returns a specific field.
+
+`SELECT "<field_key>","<field_key>"`
+Returns more than one field.
+
+`SELECT "<field_key>","<tag_key>"`
+Returns a specific field and tag.
+The `SELECT` clause must specify at least one field when it includes a tag.
+
+`SELECT "<field_key>"::field,"<tag_key>"::tag`
+Returns a specific field and tag.
+The `::[field | tag]` syntax specifies the [identifier's](/influxdb/v1.3/concepts/glossary/#identifier) type.
+Use this syntax to differentiate between field keys and tag keys that have the same name.
+
+Other supported features:
+[Arithmetic Operations](/influxdb/v1.3/query_language/math_operators/),
+[Functions](/influxdb/v1.3/query_language/functions/),
+[Basic Cast Operations](#data-types-and-cast-operations),
+[Regular Expressions](#regular-expressions)
+
+#### `FROM` clause
+The `FROM` clause supports several formats for specifying a [measurement(s)](/influxdb/v1.3/concepts/glossary/#measurement):
+
+`FROM <measurement_name>`
+Returns data from a single measurement.
+If you're using the [CLI](/influxdb/v1.3/tools/shell/), InfluxDB queries the measurement in the
+[`USE`d](/influxdb/v1.3/tools/shell/#commands)
+[database](/influxdb/v1.3/concepts/glossary/#database) and the `DEFAULT` [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp).
+If you're using the [HTTP API](/influxdb/v1.3/tools/api/), InfluxDB queries the
+measurement in the database specified in the [`db` query string parameter](/influxdb/v1.3/tools/api/#query-string-parameters)
+and the `DEFAULT` retention policy.
+
+`FROM <measurement_name>,<measurement_name>`
+Returns data from more than one measurement.
+
+`FROM <database_name>.<retention_policy_name>.<measurement_name>`
+Returns data from a fully qualified measurement.
+Fully qualify a measurement by specifying its database and retention policy.
+
+`FROM <database_name>..<measurement_name>`
+Returns data from a measurement in a user-specified [database](/influxdb/v1.3/concepts/glossary/#database) and the `DEFAULT`
+[retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp).
+
+Other supported features:
+[Regular Expressions](#regular-expressions)
+
+#### Quoting
+[Identifiers](/influxdb/v1.3/concepts/glossary/#identifier) **must** be double quoted if they contain characters other than `[A-z,0-9,_]`, if they
+begin with a digit, or if they are an [InfluxQL keyword](https://github.com/influxdata/influxql/blob/master/README.md#keywords).
+While not always necessary, we recommend that you double quote identifiers.
+
+> **Note:** The quoting syntax for queries differs from the [line protocol](/influxdb/v1.3/concepts/glossary/#line-protocol).
+Please review the [rules for single and double-quoting](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#when-should-i-single-quote-and-when-should-i-double-quote-in-queries) in queries.
+
+### Examples
+
+#### Example 1: Select all fields and tags from a single measurement
+```
+> SELECT * FROM "h2o_feet"
+
+name: h2o_feet
+--------------
+time level description location water_level
+2015-08-18T00:00:00Z below 3 feet santa_monica 2.064
+2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12
+[...]
+2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066
+2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938
+```
+
+The query selects all [fields](/influxdb/v1.3/concepts/glossary/#field) and
+[tags](/influxdb/v1.3/concepts/glossary/#tag) from the `h2o_feet`
+[measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+If you're using the [CLI](/influxdb/v1.3/tools/shell/), be sure to enter
+`USE NOAA_water_database` before you run the query.
+The CLI queries the data in the `USE`d database and the +`DEFAULT` [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp). +If you're using the [HTTP API](/influxdb/v1.3/tools/api/) be sure to set the +`db` [query string parameter](/influxdb/v1.3/tools/api/#query-string-parameters) +to `NOAA_water_database`. +If you do not set the `rp` query string parameter, the HTTP API automatically +queries the database's `DEFAULT` retention policy. + +#### Example 2: Select specific tags and fields from a single measurement +``` +> SELECT "level description","location","water_level" FROM "h2o_feet" + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938 +``` + +The query selects the `level description` field, the `location` tag, and the +`water_level` field. +Note that the `SELECT` clause must specify at least one field when it includes +a tag. + +#### Example 3: Select specific tags and fields from a single measurement, and provide their identifier type +``` +> SELECT "level description"::field,"location"::tag,"water_level"::field FROM "h2o_feet" + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938 +``` + +The query selects the `level description` field, the `location` tag, and the +`water_level` field from the `h2o_feet` measurement. +The `::[field | tag]` syntax specifies if the +[identifier](/influxdb/v1.3/concepts/glossary/#identifier) is a field or tag. 
+Use `::[field | tag]` to differentiate between [an identical field key and tag key ](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-do-i-query-data-with-an-identical-tag-key-and-field-key). +That syntax is not required for most use cases. + +#### Example 4: Select all fields from a single measurement +``` +> SELECT *::field FROM "h2o_feet" + +name: h2o_feet +-------------- +time level description water_level +2015-08-18T00:00:00Z below 3 feet 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet 4.938 +``` + +The query selects all fields from the `h2o_feet` measurement. +The `SELECT` clause supports combining the `*` syntax with the `::` syntax. + +#### Example 5: Select a specific field from a measurement and perform basic arithmetic +``` +> SELECT ("water_level" * 2) + 4 from "h2o_feet" + +name: h2o_feet +-------------- +time water_level +2015-08-18T00:00:00Z 20.24 +2015-08-18T00:00:00Z 8.128 +[...] +2015-09-18T21:36:00Z 14.132 +2015-09-18T21:42:00Z 13.876 +``` + +The query multiplies `water_level`'s field values by two and adds four to those +values. +Note that InfluxDB follows the standard order of operations. +See [Mathematical Operators](/influxdb/v1.3/query_language/math_operators/) +for more on supported operators. + +#### Example 6: Select all data from more than one measurement +``` +> SELECT * FROM "h2o_feet","h2o_pH" + +name: h2o_feet +-------------- +time level description location pH water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938 + +name: h2o_pH +------------ +time level description location pH water_level +2015-08-18T00:00:00Z santa_monica 6 +2015-08-18T00:00:00Z coyote_creek 7 +[...] 
+2015-09-18T21:36:00Z santa_monica 8 +2015-09-18T21:42:00Z santa_monica 7 +``` + +The query selects all fields and tags from two measurements: `h2o_feet` and +`h2o_pH`. +Separate multiple measurements with a comma (`,`). + +#### Example 7: Select all data from a fully qualified measurement +``` +> SELECT * FROM "NOAA_water_database"."autogen"."h2o_feet" + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938 +``` + +The query selects data in the `NOAA_water_database`, the `autogen` retention +policy, and the measurement `h2o_feet`. + +In the CLI, fully qualify a measurement to query data in a database other +than the `USE`d database and in a retention policy other than the +`DEFAULT` retention policy. +In the HTTP API, fully qualify a measurement in place of using the `db` +and `rp` query string parameters if desired. + +#### Example 8: Select all data from a measurement in a particular database +``` +> SELECT * FROM "NOAA_water_database".."h2o_feet" + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +[...] +2015-09-18T21:36:00Z between 3 and 6 feet santa_monica 5.066 +2015-09-18T21:42:00Z between 3 and 6 feet santa_monica 4.938 +``` + +The query selects data in the `NOAA_water_database`, the `DEFAULT` retention +policy, and the `h2o_feet` measurement. +The `..` indicates the `DEFAULT` retention policy for the specified database. + +In the CLI, specify the database to query data in a database other than the +`USE`d database. +In the HTTP API, specify the database in place of using the `db` query +string parameter if desired. 
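When querying over the HTTP API from a script, the `db` and `rp` query string parameters and the fully qualified measurement are two routes to the same data. A sketch of building the request URL (the `/query` endpoint and its `q`, `db`, and `rp` parameters are part of the v1.x API; the `query_url` helper itself is hypothetical):

```python
from urllib.parse import urlencode

def query_url(host: str, q: str, db: str = None, rp: str = None) -> str:
    # db and rp are optional query string parameters; omitting them
    # leaves the target up to the query text itself.
    params = {"q": q}
    if db is not None:
        params["db"] = db
    if rp is not None:
        params["rp"] = rp
    return "http://{}:8086/query?{}".format(host, urlencode(params))

# Two equivalent ways to target NOAA_water_database's autogen policy:
print(query_url("localhost", 'SELECT * FROM "h2o_feet"',
                db="NOAA_water_database", rp="autogen"))
print(query_url("localhost",
                'SELECT * FROM "NOAA_water_database"."autogen"."h2o_feet"'))
```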
+ +### Common Issues with the SELECT statement + +#### Issue 1: Selecting tag keys in the SELECT clause +A query requires at least one [field key](/influxdb/v1.3/concepts/glossary/#field-key) +in the `SELECT` clause to return data. +If the `SELECT` clause only includes a single [tag key](/influxdb/v1.3/concepts/glossary/#tag-key) or several tag keys, the +query returns an empty response. +This behavior is a result of how the system stores data. + +##### Example +
+The following query returns no data because it specifies a single tag key (`location`) in +the `SELECT` clause: +``` +> SELECT "location" FROM "h2o_feet" +> +``` +To return any data associated with the `location` tag key, the query's `SELECT` +clause must include at least one field key (`water_level`): +``` +> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3 +name: h2o_feet +time water_level location +---- ----------- -------- +2015-08-18T00:00:00Z 8.12 coyote_creek +2015-08-18T00:00:00Z 2.064 santa_monica +[...] +2015-09-18T21:36:00Z 5.066 santa_monica +2015-09-18T21:42:00Z 4.938 santa_monica +``` + +## The `WHERE` clause +The `WHERE` filters data based on +[fields](/influxdb/v1.3/concepts/glossary/#field), +[tags](/influxdb/v1.3/concepts/glossary/#tag), and/or +[timestamps](/influxdb/v1.3/concepts/glossary/#timestamp). + +Tired of reading? Check out this InfluxQL Short: +
+
+
+
+### Syntax
+
+```
+SELECT_clause FROM_clause WHERE <conditional_expression> [(AND|OR) <conditional_expression> [...]]
+```
+
+### Description of Syntax
+
+The `WHERE` clause supports `conditional_expression`s on fields, tags, and
+timestamps.
+
+#### fields
+
+```
+field_key <operator> ['string' | boolean | float | integer]
+```
+
+The `WHERE` clause supports comparisons against string, boolean, float,
+and integer [field values](/influxdb/v1.3/concepts/glossary/#field-value).
+
+Single quote string field values in the `WHERE` clause.
+Queries with unquoted string field values or double quoted string field values
+will not return any data and, in most cases,
+[will not return an error](#common-issues-with-the-where-clause).
+
+Supported operators:
+`=`   equal to
+`<>` not equal to
+`!=` not equal to
+`>`   greater than
+`>=` greater than or equal to
+`<`   less than
+`<=` less than or equal to
+
+Other supported features:
+[Arithmetic Operations](/influxdb/v1.3/query_language/math_operators/),
+[Regular Expressions](#regular-expressions)
+
+#### tags
+
+```
+tag_key <operator> ['tag_value']
+```
+
+Single quote [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) in
+the `WHERE` clause.
+Queries with unquoted tag values or double quoted tag values will not return
+any data and, in most cases,
+[will not return an error](#common-issues-with-the-where-clause).
+
+Supported operators:
+`=`   equal to
+`<>` not equal to
+`!=` not equal to
+
+Other supported features:
+[Regular Expressions](#regular-expressions)
+
+#### timestamps
+
+For most `SELECT` statements, the default time range is between [`1677-09-21T00:12:43.145224194Z` and `2262-04-11T23:47:16.854775806Z` UTC](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#what-are-the-minimum-and-maximum-timestamps-that-influxdb-can-store).
+For `SELECT` statements with a [`GROUP BY time()` clause](#group-by-time-intervals), the default time
+range is between `1677-09-21T00:12:43.145224194Z` UTC and [`now()`](/influxdb/v1.3/concepts/glossary/#now).
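Those two boundary timestamps fall out of InfluxDB storing timestamps as signed 64-bit nanosecond counts relative to the Unix epoch, with the most extreme values reserved internally. The exact reserved offsets below are an assumption chosen to reproduce the documented bounds, but the arithmetic is easy to sanity-check (truncated to microseconds, since Python's `datetime` has no nanosecond field):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Signed 64-bit nanosecond range, with the extreme values assumed reserved.
min_ns = -(2**63) + 2   # -9223372036854775806
max_ns = 2**63 - 2      #  9223372036854775806

min_time = EPOCH + timedelta(microseconds=min_ns // 1000)
max_time = EPOCH + timedelta(microseconds=max_ns // 1000)
print(min_time.isoformat())  # 1677-09-21T00:12:43.145224+00:00
print(max_time.isoformat())  # 2262-04-11T23:47:16.854775+00:00
```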
+ +The [Time Syntax](#time-syntax) section on this page +details how to specify alternative time ranges in the `WHERE` clause. + +### Examples + +#### Example 1: Select data that have specific field key-values +``` +> SELECT * FROM "h2o_feet" WHERE "water_level" > 8 + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12 +2015-08-18T00:06:00Z between 6 and 9 feet coyote_creek 8.005 +[...] +2015-09-18T00:12:00Z between 6 and 9 feet coyote_creek 8.189 +2015-09-18T00:18:00Z between 6 and 9 feet coyote_creek 8.084 +``` + +The query returns data from the `h2o_feet` +[measurement](/influxdb/v1.3/concepts/glossary/#measurement) with +[field values](/influxdb/v1.3/concepts/glossary/#field-value) of `water_level` +that are greater than eight. + +#### Example 2: Select data that have a specific string field key-value +``` +> SELECT * FROM "h2o_feet" WHERE "level description" = 'below 3 feet' + +name: h2o_feet +-------------- +time level description location water_level +2015-08-18T00:00:00Z below 3 feet santa_monica 2.064 +2015-08-18T00:06:00Z below 3 feet santa_monica 2.116 +[...] +2015-09-18T14:06:00Z below 3 feet santa_monica 2.999 +2015-09-18T14:36:00Z below 3 feet santa_monica 2.907 +``` + +The query returns data from the `h2o_feet` measurement with field values of +`level description` that equal the `below 3 feet` string. +InfluxQL requires single quotes around string field values in the `WHERE` +clause. 
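Programs that assemble `WHERE` clauses trip over this constantly, because the quoting depends on the value's type. A hypothetical helper that follows the rules above — single quotes for string field values, bare literals for numbers and booleans:

```python
def where_literal(value) -> str:
    # Check booleans first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return "'{}'".format(value)   # string field values get single quotes
    return str(value)                 # numeric values stay bare

query = 'SELECT * FROM "h2o_feet" WHERE "level description" = {}'.format(
    where_literal("below 3 feet"))
print(query)
print(where_literal(8))     # numeric comparisons stay unquoted
print(where_literal(True))  # prints: true
```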
+
+#### Example 3: Select data that have a specific field key-value and perform basic arithmetic
+```
+> SELECT * FROM "h2o_feet" WHERE "water_level" + 2 > 11.9
+
+name: h2o_feet
+--------------
+time level description location water_level
+2015-08-29T07:06:00Z at or greater than 9 feet coyote_creek 9.902
+2015-08-29T07:12:00Z at or greater than 9 feet coyote_creek 9.938
+2015-08-29T07:18:00Z at or greater than 9 feet coyote_creek 9.957
+2015-08-29T07:24:00Z at or greater than 9 feet coyote_creek 9.964
+2015-08-29T07:30:00Z at or greater than 9 feet coyote_creek 9.954
+2015-08-29T07:36:00Z at or greater than 9 feet coyote_creek 9.941
+2015-08-29T07:42:00Z at or greater than 9 feet coyote_creek 9.925
+2015-08-29T07:48:00Z at or greater than 9 feet coyote_creek 9.902
+2015-09-02T23:30:00Z at or greater than 9 feet coyote_creek 9.902
+```
+
+The query returns data from the `h2o_feet` measurement with field values of
+`water_level` plus two that are greater than 11.9.
+Note that InfluxDB follows the standard order of operations.
+See [Mathematical Operators](/influxdb/v1.3/query_language/math_operators/)
+for more on supported operators.
+
+#### Example 4: Select data that have a specific tag key-value
+
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica'
+
+name: h2o_feet
+--------------
+time water_level
+2015-08-18T00:00:00Z 2.064
+2015-08-18T00:06:00Z 2.116
+[...]
+2015-09-18T21:36:00Z 5.066
+2015-09-18T21:42:00Z 4.938
+```
+
+The query returns data from the `h2o_feet` measurement where the
+[tag key](/influxdb/v1.3/concepts/glossary/#tag-key) `location` is set to `santa_monica`.
+InfluxQL requires single quotes around tag values in the `WHERE` clause.
+ +#### Example 5: Select data that have specific field key-values and tag key-values +``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location" <> 'santa_monica' AND (water_level < -0.59 OR water_level > 9.95) + +name: h2o_feet +-------------- +time water_level +2015-08-29T07:18:00Z 9.957 +2015-08-29T07:24:00Z 9.964 +2015-08-29T07:30:00Z 9.954 +2015-08-29T14:30:00Z -0.61 +2015-08-29T14:36:00Z -0.591 +2015-08-30T15:18:00Z -0.594 +``` + +The query returns data from the `h2o_feet` measurement where the tag key +`location` is not set to `santa_monica` and where the field values of +`water_level` are either less than -0.59 or greater than 9.95. +The `WHERE` clause supports the operators `AND` and `OR`, and supports +separating logic with parentheses. + +#### Example 6: Select data that have specific timestamps +``` +> SELECT * FROM "h2o_feet" WHERE time > now() - 7d +``` + +The query returns data from the `h2o_feet` measurement that have [timestamps](/influxdb/v1.3/concepts/glossary/#timestamp) +within the past seven days. +The [Time Syntax](#time-syntax) section on this page +offers in-depth information on supported time syntax in the `WHERE` clause. + +### Common Issues with the `WHERE` Clause + +#### Issue 1: A `WHERE` clause query unexpectedly returns no data + +In most cases, this issue is the result of missing single quotes around +[tag values](/influxdb/v1.3/concepts/glossary/#tag-value) +or string [field values](/influxdb/v1.3/concepts/glossary/#field-value). +Queries with unquoted or double quoted tag values or string field values will +not return any data and, in most cases, will not return an error. + +The first two queries in the code block below attempt to specify the tag value +`santa_monica` without any quotes and with double quotes. +Those queries return no results. +The third query single quotes `santa_monica` (this is the supported syntax) +and returns the expected results. 
+``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = santa_monica + +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = "santa_monica" + +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' + +name: h2o_feet +-------------- +time water_level +2015-08-18T00:00:00Z 2.064 +[...] +2015-09-18T21:42:00Z 4.938 +``` + +The first two queries in the code block below attempt to specify the string +field value `at or greater than 9 feet` without any quotes and with double +quotes. +The first query returns an error because the string field value includes +white spaces. +The second query returns no results. +The third query single quotes `at or greater than 9 feet` (this is the +supported syntax) and returns the expected results. + +``` +> SELECT "level description" FROM "h2o_feet" WHERE "level description" = at or greater than 9 feet + +ERR: error parsing query: found than, expected ; at line 1, char 86 + +> SELECT "level description" FROM "h2o_feet" WHERE "level description" = "at or greater than 9 feet" + +> SELECT "level description" FROM "h2o_feet" WHERE "level description" = 'at or greater than 9 feet' + +name: h2o_feet +-------------- +time level description +2015-08-26T04:00:00Z at or greater than 9 feet +[...] +2015-09-15T22:42:00Z at or greater than 9 feet +``` + +
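If you build InfluxQL strings in application code, it helps to encode the quoting rules above in one place: double quotes for identifiers, single quotes for string values. A minimal Python sketch under those rules (the helper names are illustrative, not part of any InfluxDB client library):

```python
def quote_identifier(name: str) -> str:
    # Identifiers (measurements, tag keys, field keys) take double quotes.
    return '"' + name.replace('"', '\\"') + '"'

def quote_string(value: str) -> str:
    # Tag values and string field values take single quotes;
    # escape any single quote embedded in the value.
    return "'" + value.replace("'", "\\'") + "'"

query = ("SELECT " + quote_identifier("water_level") +
         " FROM " + quote_identifier("h2o_feet") +
         " WHERE " + quote_identifier("location") +
         " = " + quote_string("santa_monica"))
print(query)
# SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica'
```

Keeping the quoting in two small helpers avoids the silent empty results shown above, which are easy to produce when queries are assembled by string concatenation.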
+
+
+# The GROUP BY clause
+
+The `GROUP BY` clause groups query results by a user-specified
+set of [tags](/influxdb/v1.3/concepts/glossary/#tag) or a time interval.
+
+* [GROUP BY tags](#group-by-tags)
+* [GROUP BY time intervals](#group-by-time-intervals)
+  * [Basic GROUP BY time() Syntax](#basic-group-by-time-syntax)
+  * [Advanced GROUP BY time() Syntax](#advanced-group-by-time-syntax)
+* [GROUP BY time intervals and fill()](#group-by-time-intervals-and-fill)
+
+## GROUP BY tags
+
+`GROUP BY <tag_key>` queries group query results by a user-specified set of [tags](/influxdb/v1.3/concepts/glossary/#tag).
+
+
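Conceptually, a `GROUP BY` on a tag partitions points by their tag value and applies the aggregate within each partition. The rough shape of that behavior, sketched in Python with a few hand-picked points from the sample data (an illustration of the semantics, not InfluxDB code):

```python
from collections import defaultdict
from statistics import mean

# (location tag value, water_level field value) pairs
points = [
    ("coyote_creek", 8.12), ("coyote_creek", 8.005),
    ("santa_monica", 2.064), ("santa_monica", 2.116),
]

# Partition the points by tag value, as GROUP BY "location" does.
groups = defaultdict(list)
for location, water_level in points:
    groups[location].append(water_level)

# One aggregated row per tag value -- one output series per tag value.
means = {loc: round(mean(values), 4) for loc, values in groups.items()}
print(means)  # {'coyote_creek': 8.0625, 'santa_monica': 2.09}
```

Each key in `means` corresponds to one series in the query output, which is why the examples below return a separate block of results per tag value.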
+
+
+#### Syntax
+
+```
+SELECT_clause FROM_clause [WHERE_clause] GROUP BY [* | <tag_key>[,<tag_key>]]
+```
+
+#### Description of Syntax
+
+`GROUP BY *`
+&nbsp;&nbsp;&nbsp;Groups results by all [tags](/influxdb/v1.3/concepts/glossary/#tag)
+
+`GROUP BY <tag_key>`
+&nbsp;&nbsp;&nbsp;Groups results by a specific tag
+
+`GROUP BY <tag_key>,<tag_key>`
+&nbsp;&nbsp;&nbsp;Groups results by more than one tag.
+The order of the [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key) is irrelevant.
+
+If the query includes a [`WHERE` clause](#the-where-clause), the `GROUP BY`
+clause must appear after the `WHERE` clause.
+
+Other supported features: [Regular Expressions](#regular-expressions)
+
+#### Examples
+
+##### Example 1: Group query results by a single tag
+``` +> SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location" + +name: h2o_feet +tags: location=coyote_creek +time mean +---- ---- +1970-01-01T00:00:00Z 5.359342451341401 + + +name: h2o_feet +tags: location=santa_monica +time mean +---- ---- +1970-01-01T00:00:00Z 3.530863470081006 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) +to calculate the average `water_level` for each +[tag value](/influxdb/v1.3/concepts/glossary/#tag-value) of `location` in +the `h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement). +InfluxDB returns results in two [series](/influxdb/v1.3/concepts/glossary/#series): one for each tag value of `location`. + +>**Note:** In InfluxDB, [epoch 0](https://en.wikipedia.org/wiki/Unix_time) (`1970-01-01T00:00:00Z`) is often used as a null timestamp equivalent. +If you request a query that has no timestamp to return, such as an [aggregation function](/influxdb/v1.3/query_language/functions/) with an unbounded time range, InfluxDB returns epoch 0 as the timestamp. + +##### Example 2: Group query results by more than one tag +
+``` +> SELECT MEAN("index") FROM "h2o_quality" GROUP BY location,randtag + +name: h2o_quality +tags: location=coyote_creek, randtag=1 +time mean +---- ---- +1970-01-01T00:00:00Z 50.69033760186263 + +name: h2o_quality +tags: location=coyote_creek, randtag=2 +time mean +---- ---- +1970-01-01T00:00:00Z 49.661867544220485 + +name: h2o_quality +tags: location=coyote_creek, randtag=3 +time mean +---- ---- +1970-01-01T00:00:00Z 49.360939907550076 + +name: h2o_quality +tags: location=santa_monica, randtag=1 +time mean +---- ---- +1970-01-01T00:00:00Z 49.132712456344585 + +name: h2o_quality +tags: location=santa_monica, randtag=2 +time mean +---- ---- +1970-01-01T00:00:00Z 50.2937984496124 + +name: h2o_quality +tags: location=santa_monica, randtag=3 +time mean +---- ---- +1970-01-01T00:00:00Z 49.99919903884662 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) to calculate the average `index` for +each combination of the `location` [tag](/influxdb/v1.3/concepts/glossary/#tag) and the `randtag` tag in the +`h2o_quality` measurement. +Separate multiple tags with a comma in the `GROUP BY` clause. + +##### Example 3: Group query results by all tags +
+
+```
+> SELECT MEAN("index") FROM "h2o_quality" GROUP BY *
+
+name: h2o_quality
+tags: location=coyote_creek, randtag=1
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   50.69033760186263
+
+
+name: h2o_quality
+tags: location=coyote_creek, randtag=2
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   49.661867544220485
+
+
+name: h2o_quality
+tags: location=coyote_creek, randtag=3
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   49.360939907550076
+
+
+name: h2o_quality
+tags: location=santa_monica, randtag=1
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   49.132712456344585
+
+
+name: h2o_quality
+tags: location=santa_monica, randtag=2
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   50.2937984496124
+
+
+name: h2o_quality
+tags: location=santa_monica, randtag=3
+time                   mean
+----                   ----
+1970-01-01T00:00:00Z   49.99919903884662
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to calculate the average `index` for every possible
+[tag](/influxdb/v1.3/concepts/glossary/#tag) combination in the `h2o_quality`
+measurement.
+
+Note that the query results are identical to the results of the query in [Example 2](#example-2-group-query-results-by-more-than-one-tag)
+where we explicitly specified the `location` and `randtag` tag keys.
+This is because the `h2o_quality` measurement only has two tag keys.
+
+## GROUP BY time intervals
+
+`GROUP BY time()` queries group query results by a user-specified time interval.
+
+### Basic GROUP BY time() Syntax
+
+#### Syntax
+```
+SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>),[tag_key] [fill(<fill_option>)]
+```
+
+#### Description of Basic Syntax
+
+Basic `GROUP BY time()` queries require an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+in the [`SELECT` clause](#the-basic-select-statement) and a time range in the
+[`WHERE` clause](#the-where-clause).
+Note that the `GROUP BY` clause must come after the `WHERE` clause.
+
+##### `time(time_interval)`
+The `time_interval` in the `GROUP BY time()` clause is a +[duration literal](/influxdb/v1.3/query_language/spec/#durations). +It determines how InfluxDB groups query results over time. +For example, a `time_interval` of `5m` groups query results into five-minute +time groups across the time range specified in the [`WHERE` clause](#the-where-clause). + +##### `fill()` +
+`fill()` is optional. +It changes the value reported for time intervals that have no data. +See [GROUP BY time intervals and `fill()`](#group-by-time-intervals-and-fill) +for more information. + +**Coverage:** + +Basic `GROUP BY time()` queries rely on the `time_interval` and on InfluxDB's +preset time boundaries to determine the raw data included in each time interval +and the timestamps returned by the query. + +#### Examples of Basic Syntax + +The examples below use the following subsample of the sample data: +``` +> SELECT "water_level","location" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' + +name: h2o_feet +-------------- +time water_level location +2015-08-18T00:00:00Z 8.12 coyote_creek +2015-08-18T00:00:00Z 2.064 santa_monica +2015-08-18T00:06:00Z 8.005 coyote_creek +2015-08-18T00:06:00Z 2.116 santa_monica +2015-08-18T00:12:00Z 7.887 coyote_creek +2015-08-18T00:12:00Z 2.028 santa_monica +2015-08-18T00:18:00Z 7.762 coyote_creek +2015-08-18T00:18:00Z 2.126 santa_monica +2015-08-18T00:24:00Z 7.635 coyote_creek +2015-08-18T00:24:00Z 2.041 santa_monica +2015-08-18T00:30:00Z 7.5 coyote_creek +2015-08-18T00:30:00Z 2.051 santa_monica +``` + +##### Example 1: Group query results into 12 minute intervals +
+
+```
+> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
+
+name: h2o_feet
+--------------
+time                   count
+2015-08-18T00:00:00Z   2
+2015-08-18T00:12:00Z   2
+2015-08-18T00:24:00Z   2
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to count the number of `water_level` points with the [tag](/influxdb/v1.3/concepts/glossary/#tag)
+`location = coyote_creek`, and it groups results into 12 minute intervals.
+
+The result for each [timestamp](/influxdb/v1.3/concepts/glossary/#timestamp)
+represents a single 12 minute interval.
+The count for the first timestamp covers the raw data between `2015-08-18T00:00:00Z`
+and up to, but not including, `2015-08-18T00:12:00Z`.
+The count for the second timestamp covers the raw data between `2015-08-18T00:12:00Z`
+and up to, but not including, `2015-08-18T00:24:00Z`.
+
+##### Example 2: Group query results into 12 minute intervals and by a tag key
+
+```
+> SELECT COUNT("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location"
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   count
+----                   -----
+2015-08-18T00:00:00Z   2
+2015-08-18T00:12:00Z   2
+2015-08-18T00:24:00Z   2
+
+name: h2o_feet
+tags: location=santa_monica
+time                   count
+----                   -----
+2015-08-18T00:00:00Z   2
+2015-08-18T00:12:00Z   2
+2015-08-18T00:24:00Z   2
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to count the number of `water_level` points.
+It groups results by the `location` tag and into 12 minute intervals.
+Note that the time interval and the tag key are separated by a comma in the
+`GROUP BY` clause.
+
+The query returns two [series](/influxdb/v1.3/concepts/glossary/#series) of results: one for each
+[tag value](/influxdb/v1.3/concepts/glossary/#tag-value) of the `location` tag.
+The result for each timestamp represents a single 12 minute interval.
+The count for the first timestamp covers the raw data between `2015-08-18T00:00:00Z`
+and up to, but not including, `2015-08-18T00:12:00Z`.
+The count for the second timestamp covers the raw data between `2015-08-18T00:12:00Z`
+and up to, but not including, `2015-08-18T00:24:00Z`.
+
+#### Common Issues with Basic Syntax
+
+##### Issue 1: Unexpected timestamps and values in query results
+With the basic syntax, InfluxDB relies on the `GROUP BY time()` interval +and on the system's preset time boundaries to determine the raw data included +in each time interval and the timestamps returned by the query. +In some cases, this can lead to unexpected results. + +**Example** + +Raw data: + +``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:18:00Z' +name: h2o_feet +-------------- +time water_level +2015-08-18T00:00:00Z 8.12 +2015-08-18T00:06:00Z 8.005 +2015-08-18T00:12:00Z 7.887 +2015-08-18T00:18:00Z 7.762 +``` + +Query and Results: + +The following query covers a 12-minute time range and groups results into 12-minute time intervals, but it returns **two** results: + +``` +> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m) + +name: h2o_feet +time count +---- ----- +2015-08-18T00:00:00Z 1 <----- Note that this timestamp occurs before the start of the query's time range +2015-08-18T00:12:00Z 1 +``` + +Explanation: + +InfluxDB uses preset round-number time boundaries for `GROUP BY` intervals that are +independent of any time conditions in the `WHERE` clause. +When it calculates the results, all returned data must occur within the query's +explicit time range but the `GROUP BY` intervals will be based on the preset +time boundaries. + +The table below shows the preset time boundary, the relevant `GROUP BY time()` interval, the +points included, and the returned timestamp for each `GROUP BY time()` +interval in the results. 
+
+| Time Interval Number | Preset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp |
+| :------------- | :------------- | :------------- | :------------- | :------------- |
+| 1 | `time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z` | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z` | `8.005` | `2015-08-18T00:00:00Z` |
+| 2 | `time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z` | `time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z` | `7.887` | `2015-08-18T00:12:00Z` |
+
+The first preset 12-minute time boundary begins at `00:00` and ends just before
+`00:12`.
+Only one raw point (`8.005`) falls both within the query's first `GROUP BY time()` interval and in that
+first time boundary.
+Note that while the returned timestamp occurs before the start of the query's time range,
+the query result excludes data that occur before the query's time range.
+
+The second preset 12-minute time boundary begins at `00:12` and ends just before
+`00:24`.
+Only one raw point (`7.887`) falls both within the query's second `GROUP BY time()` interval and in that
+second time boundary.
+
+The [advanced `GROUP BY time()` syntax](#advanced-group-by-time-syntax) allows users to shift
+the start time of InfluxDB's preset time boundaries.
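The preset boundaries behave like flooring each point's timestamp onto an epoch-aligned grid of the interval, and a boundary offset simply shifts that grid. A sketch of the boundary arithmetic for whole-second intervals (our own helper for illustration, not InfluxDB code):

```python
from datetime import datetime, timezone

def boundary_start(ts: datetime, interval_s: int, offset_s: int = 0) -> datetime:
    """Floor ts onto the epoch-aligned grid of interval_s seconds,
    optionally shifted by offset_s seconds."""
    epoch = ts.timestamp()
    floored = (epoch - offset_s) // interval_s * interval_s + offset_s
    return datetime.fromtimestamp(floored, tz=timezone.utc)

t = datetime(2015, 8, 18, 0, 6, tzinfo=timezone.utc)

# On the 12m grid, 00:06 falls in the boundary that starts at 00:00 --
# which is why the first returned timestamp precedes the query's time range.
print(boundary_start(t, 12 * 60))          # 2015-08-18 00:00:00+00:00
# With a 6m offset, the boundary starts at 00:06, inside the query's range.
print(boundary_start(t, 12 * 60, 6 * 60))  # 2015-08-18 00:06:00+00:00
```

This flooring behavior, not the `WHERE` clause, determines the timestamps that `GROUP BY time()` returns.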
+[Example 3](#example-3-group-query-results-into-12-minute-intervals-and-shift-the-preset-time-boundaries-forward)
+in the Advanced Syntax section continues with the query shown here;
+it shifts forward the preset time boundaries by six minutes such that
+InfluxDB returns:
+
+```
+name: h2o_feet
+time                   count
+----                   -----
+2015-08-18T00:06:00Z   2
+```
+
+### Advanced GROUP BY time() Syntax
+
+#### Syntax
+
+```
+SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>,<offset_interval>),[tag_key] [fill(<fill_option>)]
+```
+
+#### Description of Advanced Syntax
+
+Advanced `GROUP BY time()` queries require an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+in the [`SELECT` clause](#the-basic-select-statement) and a time range in the
+[`WHERE` clause](#the-where-clause).
+Note that the `GROUP BY` clause must come after the `WHERE` clause.
+
+##### `time(time_interval,offset_interval)`
+See the [Basic GROUP BY time() Syntax](#basic-group-by-time-syntax) +for details on the `time_interval`. + +The `offset_interval` is a +[duration literal](/influxdb/v1.3/query_language/spec/#durations). +It shifts forward or back InfluxDB's preset time boundaries. +The `offset_interval` can be positive or negative. + +##### `fill()` +
+
+`fill()` is optional.
+It changes the value reported for time intervals that have no data.
+See [GROUP BY time intervals and `fill()`](#group-by-time-intervals-and-fill)
+for more information.
+
+**Coverage:**
+
+Advanced `GROUP BY time()` queries rely on the `time_interval`, the `offset_interval`,
+and on InfluxDB's preset time boundaries to determine the raw data included in each time interval
+and the timestamps returned by the query.
+
+#### Examples of Advanced Syntax
+
+The examples below use the following subsample of the sample data:
+
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:54:00Z'
+
+name: h2o_feet
+--------------
+time                   water_level
+2015-08-18T00:00:00Z   8.12
+2015-08-18T00:06:00Z   8.005
+2015-08-18T00:12:00Z   7.887
+2015-08-18T00:18:00Z   7.762
+2015-08-18T00:24:00Z   7.635
+2015-08-18T00:30:00Z   7.5
+2015-08-18T00:36:00Z   7.372
+2015-08-18T00:42:00Z   7.234
+2015-08-18T00:48:00Z   7.11
+2015-08-18T00:54:00Z   6.982
+```
+
+##### Example 1: Group query results into 18 minute intervals and shift the preset time boundaries forward
+``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m,6m) + +name: h2o_feet +time mean +---- ---- +2015-08-18T00:06:00Z 7.884666666666667 +2015-08-18T00:24:00Z 7.502333333333333 +2015-08-18T00:42:00Z 7.108666666666667 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) +to calculate the average `water_level`, grouping results into 18 minute +time intervals, and offsetting the preset time boundaries by six minutes. + +The time boundaries and returned timestamps for the query **without** the `offset_interval` adhere to InfluxDB's preset time boundaries. Let's first examine the results without the offset: + +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m) + +name: h2o_feet +time mean +---- ---- +2015-08-18T00:00:00Z 7.946 +2015-08-18T00:18:00Z 7.6323333333333325 +2015-08-18T00:36:00Z 7.238666666666667 +2015-08-18T00:54:00Z 6.982 +``` + +The time boundaries and returned timestamps for the query **without** the +`offset_interval` adhere to InfluxDB's preset time boundaries: + +| Time Interval Number | Preset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp | +| :------------- | :------------- | :------------- | :------------- | :------------- | +| 1 | `time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z` | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z` | `8.005`,`7.887` | `2015-08-18T00:00:00Z` | +| 2 | `time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z` | <--- same | `7.762`,`7.635`,`7.5` | `2015-08-18T00:18:00Z` | +| 3 | `time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z` | <--- same | `7.372`,`7.234`,`7.11` | `2015-08-18T00:36:00Z` | +| 4 | `time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z` | `time = 
2015-08-18T00:54:00Z` | `6.982` | `2015-08-18T00:54:00Z` | + +The first preset 18-minute time boundary begins at `00:00` and ends just before +`00:18`. +Two raw points (`8.005` and `7.887`) fall both within the first `GROUP BY time()` interval and in that +first time boundary. +Note that while the returned timestamp occurs before the start of the query's time range, +the query result excludes data that occur before the query's time range. + +The second preset 18-minute time boundary begins at `00:18` and ends just before +`00:36`. +Three raw points (`7.762` and `7.635` and `7.5`) fall both within the second `GROUP BY time()` interval and in that +second time boundary. In this case, the boundary time range and the interval's time range are the same. + +The fourth preset 18-minute time boundary begins at `00:54` and ends just before +`1:12:00`. +One raw point (`6.982`) falls both within the fourth `GROUP BY time()` interval and in that +fourth time boundary. + +The time boundaries and returned timestamps for the query **with** the +`offset_interval` adhere to the offset time boundaries: + +| Time Interval Number | Offset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp | +| :------------- | :------------- | :------------- | :------------- | ------------- | +| 1 | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z` | <--- same | `8.005`,`7.887`,`7.762` | `2015-08-18T00:06:00Z` | +| 2 | `time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z` | <--- same | `7.635`,`7.5`,`7.372` | `2015-08-18T00:24:00Z` | +| 3 | `time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z` | <--- same | `7.234`,`7.11`,`6.982` | `2015-08-18T00:42:00Z` | +| 4 | `time >= 2015-08-18T01:00:00Z AND time < 2015-08-18T01:18:00Z` | NA | NA | NA | + +The six-minute offset interval shifts forward the preset boundary's time range +such that the boundary time ranges and the relevant `GROUP BY time()` interval time ranges are +always the same. 
+With the offset, each interval performs the calculation on three points, and
+the timestamp returned matches both the start of the boundary time range and the
+start of the `GROUP BY time()` interval time range.
+
+Note that `offset_interval` forces the fourth time boundary to be outside
+the query's time range so the query returns no results for that last interval.
+
+##### Example 2: Group query results into 18 minute intervals and shift the preset time boundaries back
+``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m,-12m) + +name: h2o_feet +time mean +---- ---- +2015-08-18T00:06:00Z 7.884666666666667 +2015-08-18T00:24:00Z 7.502333333333333 +2015-08-18T00:42:00Z 7.108666666666667 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) +to calculate the average `water_level`, grouping results into 18 minute +time intervals, and offsetting the preset time boundaries by -12 minutes. + +> **Note:** +The query in Example 2 returns the same results as the query in Example 1, but +the query in Example 2 uses a negative `offset_interval` instead of a positive +`offset_interval`. +There are no performance differences between the two queries; feel free to choose the most +intuitive option when deciding between a positive and negative `offset_interval`. + +The time boundaries and returned timestamps for the query **without** the `offset_interval` adhere to InfluxDB's preset time boundaries. 
Let's first examine the results without the offset: +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time <= '2015-08-18T00:54:00Z' GROUP BY time(18m) + +name: h2o_feet +time mean +---- ---- +2015-08-18T00:00:00Z 7.946 +2015-08-18T00:18:00Z 7.6323333333333325 +2015-08-18T00:36:00Z 7.238666666666667 +2015-08-18T00:54:00Z 6.982 +``` + +The time boundaries and returned timestamps for the query **without** the +`offset_interval` adhere to InfluxDB's preset time boundaries: + +| Time Interval Number | Preset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp | +| :------------- | :------------- | :------------- | :------------- | :------------- | +| 1 | `time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:18:00Z` | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z` | `8.005`,`7.887` | `2015-08-18T00:00:00Z` | +| 2 | `time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:36:00Z` | <--- same | `7.762`,`7.635`,`7.5` | `2015-08-18T00:18:00Z` | +| 3 | `time >= 2015-08-18T00:36:00Z AND time < 2015-08-18T00:54:00Z` | <--- same | `7.372`,`7.234`,`7.11` | `2015-08-18T00:36:00Z` | +| 4 | `time >= 2015-08-18T00:54:00Z AND time < 2015-08-18T01:12:00Z` | `time = 2015-08-18T00:54:00Z` | `6.982` | `2015-08-18T00:54:00Z` | + +The first preset 18-minute time boundary begins at `00:00` and ends just before +`00:18`. +Two raw points (`8.005` and `7.887`) fall both within the first `GROUP BY time()` interval and in that +first time boundary. +Note that while the returned timestamp occurs before the start of the query's time range, +the query result excludes data that occur before the query's time range. + +The second preset 18-minute time boundary begins at `00:18` and ends just before +`00:36`. +Three raw points (`7.762` and `7.635` and `7.5`) fall both within the second `GROUP BY time()` interval and in that +second time boundary. 
In this case, the boundary time range and the interval's time range are the same. + +The fourth preset 18-minute time boundary begins at `00:54` and ends just before +`1:12:00`. +One raw point (`6.982`) falls both within the fourth `GROUP BY time()` interval and in that +fourth time boundary. + +The time boundaries and returned timestamps for the query **with** the +`offset_interval` adhere to the offset time boundaries: + +| Time Interval Number | Offset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp | +| :------------- | :------------- | :------------- | :------------- | ------------- | +| 1 | `time >= 2015-08-17T23:48:00Z AND time < 2015-08-18T00:06:00Z` | NA | NA | NA | +| 2 | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:24:00Z` | <--- same | `8.005`,`7.887`,`7.762` | `2015-08-18T00:06:00Z` | +| 3 | `time >= 2015-08-18T00:24:00Z AND time < 2015-08-18T00:42:00Z` | <--- same | `7.635`,`7.5`,`7.372` | `2015-08-18T00:24:00Z` | +| 4 | `time >= 2015-08-18T00:42:00Z AND time < 2015-08-18T01:00:00Z` | <--- same | `7.234`,`7.11`,`6.982` | `2015-08-18T00:42:00Z` | + +The negative 12-minute offset interval shifts back the preset boundary's time range +such that the boundary time ranges and the relevant `GROUP BY time()` interval time ranges are always the +same. +With the offset, each interval performs the calculation on three points, and +the timestamp returned matches both the start of the boundary time range and the +start of the `GROUP BY time()` interval time range. + +Note that `offset_interval` forces the first time boundary to be outside +the query's time range so the query returns no results for that first interval. + +##### Example 3: Group query results into 12 minute intervals and shift the preset time boundaries forward +
+
+This example is a continuation of the scenario outlined in [Common Issues with Basic Syntax](#common-issues-with-basic-syntax).
+
+```
+> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m,6m)
+
+name: h2o_feet
+time                   count
+----                   -----
+2015-08-18T00:06:00Z   2
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to count the number of `water_level` points, grouping results into 12 minute
+time intervals, and offsetting the preset time boundaries by six minutes.
+
+The time boundaries and returned timestamps for the query **without** the `offset_interval` adhere to InfluxDB's preset time boundaries. Let's first examine the results without the offset:
+
+```
+> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:06:00Z' AND time < '2015-08-18T00:18:00Z' GROUP BY time(12m)
+
+name: h2o_feet
+time                   count
+----                   -----
+2015-08-18T00:00:00Z   1
+2015-08-18T00:12:00Z   1
+```
+
+The time boundaries and returned timestamps for the query **without** the
+`offset_interval` adhere to InfluxDB's preset time boundaries:
+
+| Time Interval Number | Preset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp |
+| :------------- | :------------- | :------------- | :------------- | :------------- |
+| 1 | `time >= 2015-08-18T00:00:00Z AND time < 2015-08-18T00:12:00Z` | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:12:00Z` | `8.005` | `2015-08-18T00:00:00Z` |
+| 2 | `time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:24:00Z` | `time >= 2015-08-18T00:12:00Z AND time < 2015-08-18T00:18:00Z` | `7.887` | `2015-08-18T00:12:00Z` |
+
+The first preset 12-minute time boundary begins at `00:00` and ends just before
+`00:12`.
+Only one raw point (`8.005`) falls both within the query's first `GROUP BY time()` interval and in that
+first time boundary.
+Note that while the returned timestamp occurs before the start of the query's time range,
+the query result excludes data that occur before the query's time range.
+
+The second preset 12-minute time boundary begins at `00:12` and ends just before
+`00:24`.
+Only one raw point (`7.887`) falls both within the query's second `GROUP BY time()` interval and in that
+second time boundary.
+
+The time boundaries and returned timestamps for the query **with** the
+`offset_interval` adhere to the offset time boundaries:
+
+| Time Interval Number | Offset Time Boundary |`GROUP BY time()` Interval | Points Included | Returned Timestamp |
+| :------------- | :------------- | :------------- | :------------- | :------------- |
+| 1 | `time >= 2015-08-18T00:06:00Z AND time < 2015-08-18T00:18:00Z` | <--- same | `8.005`,`7.887` | `2015-08-18T00:06:00Z` |
+| 2 | `time >= 2015-08-18T00:18:00Z AND time < 2015-08-18T00:30:00Z` | NA | NA | NA |
+
+The six-minute offset interval shifts forward the preset boundary's time range
+such that the preset boundary time range and the relevant `GROUP BY time()` interval time range are the
+same.
+With the offset, the query returns a single result, and the timestamp returned
+matches both the start of the boundary time range and the start of the `GROUP BY time()` interval
+time range.
+
+Note that `offset_interval` forces the second time boundary to be outside
+the query's time range so the query returns no results for that second interval.
+
+## `GROUP BY` time intervals and `fill()`
+
+`fill()` changes the value reported for time intervals that have no data.
+
+#### Syntax
+
+```
+SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>,[<offset_interval>])[,tag_key] [fill(<fill_option>)]
+```
+
+#### Description of Syntax
+
+By default, a `GROUP BY time()` interval with no data reports `null` as its
+value in the output column.
+`fill()` changes the value reported for time intervals that have no data.
+Note that `fill()` must go at the end of the `GROUP BY` clause if you're +`GROUP(ing) BY` several things (for example, both [tags](/influxdb/v1.3/concepts/glossary/#tag) and a time interval). + +##### fill_option +
+
+Any numerical value
+&nbsp;&nbsp;&nbsp;Reports the given numerical value for time intervals with no data.
+
+`linear`
+&nbsp;&nbsp;&nbsp;Reports the results of [linear interpolation](https://en.wikipedia.org/wiki/Linear_interpolation) for time intervals with no data.
+
+`none`
+&nbsp;&nbsp;&nbsp;Reports no timestamp and no value for time intervals with no data.
+
+`null`
+&nbsp;&nbsp;&nbsp;Reports null for time intervals with no data but returns a timestamp. This is the same as the default behavior.
+
+`previous`
+&nbsp;&nbsp;&nbsp;Reports the value from the previous time interval for time intervals with no data.
+
+#### Examples
+
+{{< tabs-wrapper >}}
+{{% tabs %}}
+[Example 1: fill(100)](#)
+[Example 2: fill(linear)](#)
+[Example 3: fill(none)](#)
+[Example 4: fill(null)](#)
+[Example 5: fill(previous)](#)
+{{% /tabs %}}
+{{% tab-content %}}
+
+Without `fill(100)`:
+```
+> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m)
+
+name: h2o_feet
+--------------
+time                   max
+2015-09-18T16:00:00Z   3.599
+2015-09-18T16:12:00Z   3.402
+2015-09-18T16:24:00Z   3.235
+2015-09-18T16:36:00Z
+```
+
+With `fill(100)`:
+```
+> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(100)
+
+name: h2o_feet
+--------------
+time                   max
+2015-09-18T16:00:00Z   3.599
+2015-09-18T16:12:00Z   3.402
+2015-09-18T16:24:00Z   3.235
+2015-09-18T16:36:00Z   100
+```
+
+`fill(100)` changes the value reported for the time interval with no data to `100`.
+ +{{% /tab-content %}} + +{{% tab-content %}} + +Without `fill(linear)`: + +``` +> SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:00:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) + +name: pond +time mean +---- ---- +2016-11-11T21:00:00Z 1 +2016-11-11T21:12:00Z +2016-11-11T21:24:00Z 3 +2016-11-11T21:36:00Z +2016-11-11T21:48:00Z +2016-11-11T22:00:00Z 6 +``` + +With `fill(linear)`: +``` +> SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:00:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear) + +name: pond +time mean +---- ---- +2016-11-11T21:00:00Z 1 +2016-11-11T21:12:00Z 2 +2016-11-11T21:24:00Z 3 +2016-11-11T21:36:00Z 4 +2016-11-11T21:48:00Z 5 +2016-11-11T22:00:00Z 6 +``` + +`fill(linear)` changes the value reported for the time interval with no data +to the results of [linear interpolation](https://en.wikipedia.org/wiki/Linear_interpolation). + +> **Note:** The data in Example 2 are not in `NOAA_water_database`. +We had to create a dataset with less regular data to work with `fill(linear)`. + +{{% /tab-content %}} + +{{% tab-content %}} + +Without `fill(none)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z +``` + +With `fill(none)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(none) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +``` + +`fill(none)` reports no value and no timestamp for the time interval with no data. 
+ +{{% /tab-content %}} + +{{% tab-content %}} + +Without `fill(null)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z +``` + +With `fill(null)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(null) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z +``` + +`fill(null)` reports `null` as the value for the time interval with no data. +That result matches the result of the query without `fill(null)`. + +{{% /tab-content %}} + +{{% tab-content %}} + +Without `fill(previous)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z +``` + +With `fill(previous)`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-09-18T16:00:00Z' AND time <= '2015-09-18T16:42:00Z' GROUP BY time(12m) fill(previous) + +name: h2o_feet +-------------- +time max +2015-09-18T16:00:00Z 3.599 +2015-09-18T16:12:00Z 3.402 +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z 3.235 +``` + +`fill(previous)` changes the value reported for the time interval with no data to `3.235`, +the value from the previous time interval. + +{{% /tab-content %}} +{{< /tabs-wrapper >}} + +#### Common issues with `fill()` + +##### Issue 1: `fill()` when no data fall within the query's time range +
+Currently, queries ignore `fill()` if no data fall within the query's time range. +This is the expected behavior. An open +[feature request](https://github.com/influxdata/influxdb/issues/6967) on GitHub +proposes that `fill()` should force a return of values even if the query's time +range covers no data. + +**Example** + +The following query returns no data because `water_level` has no points within +the query's time range. +Note that `fill(800)` has no effect on the query results. +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" = 'coyote_creek' AND time >= '2015-09-18T22:00:00Z' AND time <= '2015-09-18T22:18:00Z' GROUP BY time(12m) fill(800) +> +``` + +##### Issue 2: `fill(previous)` when the previous result falls outside the query's time range +
+`fill(previous)` doesn’t fill the result for a time interval if the previous +value is outside the query’s time range. + +**Example** + +The following query covers the time range between `2015-09-18T16:24:00Z` and `2015-09-18T16:54:00Z`. +Note that `fill(previous)` fills the result for `2015-09-18T16:36:00Z` with the +result from `2015-09-18T16:24:00Z`. +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE location = 'coyote_creek' AND time >= '2015-09-18T16:24:00Z' AND time <= '2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous) + +name: h2o_feet +-------------- +time max +2015-09-18T16:24:00Z 3.235 +2015-09-18T16:36:00Z 3.235 +2015-09-18T16:48:00Z 4 +``` + +The next query shortens the time range in the previous query. +It now covers the time between `2015-09-18T16:36:00Z` and `2015-09-18T16:54:00Z`. +Note that `fill(previous)` doesn't fill the result for `2015-09-18T16:36:00Z` with the +result from `2015-09-18T16:24:00Z`; the result for `2015-09-18T16:24:00Z` is outside the query's +shorter time range. + +``` +> SELECT MAX("water_level") FROM "h2o_feet" WHERE location = 'coyote_creek' AND time >= '2015-09-18T16:36:00Z' AND time <= '2015-09-18T16:54:00Z' GROUP BY time(12m) fill(previous) + +name: h2o_feet +-------------- +time max +2015-09-18T16:36:00Z +2015-09-18T16:48:00Z 4 +``` + +##### Issue 3: `fill(linear)` when the previous or following result falls outside the query's time range +
+`fill(linear)` doesn't fill the result for a time interval with no data if the
+previous result or the following result is outside the query's time range.
+
+**Example**
+
+The following query covers the time range between `2016-11-11T21:24:00Z` and
+`2016-11-11T22:06:00Z`. Note that `fill(linear)` fills the results for the
+`2016-11-11T21:36:00Z` time interval and the `2016-11-11T21:48:00Z` time interval
+using the values from the `2016-11-11T21:24:00Z` time interval and the
+`2016-11-11T22:00:00Z` time interval.
+
+```
+> SELECT MEAN("tadpoles") FROM "pond" WHERE time > '2016-11-11T21:24:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
+
+name: pond
+time                   mean
+----                   ----
+2016-11-11T21:24:00Z   3
+2016-11-11T21:36:00Z   4
+2016-11-11T21:48:00Z   5
+2016-11-11T22:00:00Z   6
+```
+
+The next query shortens the time range in the previous query.
+It now covers the time between `2016-11-11T21:36:00Z` and `2016-11-11T22:06:00Z`.
+Note that `fill(linear)` doesn't fill the results for the `2016-11-11T21:36:00Z`
+time interval and the `2016-11-11T21:48:00Z` time interval; the result for
+`2016-11-11T21:24:00Z` is outside the query's shorter time range and InfluxDB
+cannot perform the linear interpolation.
+
+```
+> SELECT MEAN("tadpoles") FROM "pond" WHERE time >= '2016-11-11T21:36:00Z' AND time <= '2016-11-11T22:06:00Z' GROUP BY time(12m) fill(linear)
+name: pond
+time                   mean
+----                   ----
+2016-11-11T21:36:00Z
+2016-11-11T21:48:00Z
+2016-11-11T22:00:00Z   6
+```
+
+> **Note:** The data in Issue 3 are not in `NOAA_water_database`.
+We had to create a dataset with less regular data to work with `fill(linear)`.
+
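The fill behavior described above, including why leading and trailing gaps stay unfilled, can be illustrated outside InfluxDB. The following Python sketch models `fill(linear)` over `GROUP BY time()` buckets; it is an illustration of the documented behavior, not InfluxDB's implementation:

```python
# A sketch of fill(linear): a bucket with no data is filled only when it sits
# between two buckets that have values inside the query's time range;
# leading and trailing gaps are left empty, as in Issue 3 above.
def fill_linear(means):
    """means: one entry per GROUP BY time() bucket; None marks a bucket with no data."""
    filled = list(means)
    known = [i for i, v in enumerate(means) if v is not None]
    for a, b in zip(known, known[1:]):
        step = (means[b] - means[a]) / (b - a)
        for i in range(a + 1, b):
            filled[i] = means[a] + step * (i - a)
    return filled

# Mirrors the tadpole example: gaps between 1 and 3, and between 3 and 6.
print(fill_linear([1, None, 3, None, None, 6]))  # [1, 2.0, 3, 4.0, 5.0, 6]
# A leading gap has no in-range previous value, so it stays unfilled:
print(fill_linear([None, None, 6]))  # [None, None, 6]
```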
+
+
+# The INTO clause
+
+The `INTO` clause writes query results to a user-specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+### Syntax
+```
+SELECT_clause INTO <measurement_name> FROM_clause [WHERE_clause] [GROUP_BY_clause]
+```
+
+### Description of Syntax
+
+The `INTO` clause supports several formats for specifying a [measurement](/influxdb/v1.3/concepts/glossary/#measurement):
+
+`INTO <measurement_name>`:
+Writes data to the specified measurement.
+If you're using the [CLI](/influxdb/v1.3/tools/shell/), InfluxDB writes the data to the measurement in the
+[`USE`d](/influxdb/v1.3/tools/shell/#commands)
+[database](/influxdb/v1.3/concepts/glossary/#database) and the `DEFAULT` [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp).
+If you're using the [HTTP API](/influxdb/v1.3/tools/api/), InfluxDB writes the data to the
+measurement in the database specified in the [`db` query string parameter](/influxdb/v1.3/tools/api/#query-string-parameters)
+and the `DEFAULT` retention policy.
+
+`INTO <database_name>.<retention_policy_name>.<measurement_name>`:
+Writes data to a fully qualified measurement.
+Fully qualify a measurement by specifying its database and retention policy.
+
+`INTO <database_name>..<measurement_name>`:
+Writes data to a measurement in a user-specified database and the `DEFAULT`
+retention policy.
+
+`INTO <database_name>.<retention_policy_name>.:MEASUREMENT FROM /<regular_expression>/`:
+Writes data to all measurements in the user-specified database and
+retention policy that match the [regular expression](#regular-expressions) in the `FROM` clause.
+`:MEASUREMENT` is a backreference to each measurement matched in the `FROM` clause.
+
+### Examples
+
+#### Example 1: Rename a database
+
+```
+> SELECT * INTO "copy_NOAA_water_database"."autogen".:MEASUREMENT FROM "NOAA_water_database"."autogen"./.*/ GROUP BY *
+
+name: result
+time                   written
+----                   -------
+0                      76290
+```
+
+Directly renaming a database in InfluxDB is not possible, so a common use for the `INTO` clause is to move data from one database to another.
+The query above writes all data in the `NOAA_water_database` and `autogen` retention policy to the `copy_NOAA_water_database` database and the `autogen` retention policy.
+
+The [backreference](#example-5-write-aggregated-results-for-more-than-one-measurement-to-a-different-database-downsampling-with-backreferencing) syntax (`:MEASUREMENT`) maintains the source measurement names in the destination database.
+Note that both the `copy_NOAA_water_database` database and its `autogen` retention policy must exist prior to running the `INTO` query.
+See [Database Management](/influxdb/v1.3/query_language/database_management/)
+for how to manage databases and retention policies.
+
+The `GROUP BY *` clause [preserves tags](#issue-1-missing-data) in the source database as tags in the destination database.
+The following query does not maintain the series context for tags; tags will be stored as fields in the destination database (`copy_NOAA_water_database`):
+
+```
+SELECT * INTO "copy_NOAA_water_database"."autogen".:MEASUREMENT FROM "NOAA_water_database"."autogen"./.*/
+```
+
+When moving large amounts of data, we recommend sequentially running `INTO` queries for different measurements and using time boundaries in the [`WHERE` clause](#time-syntax).
+This prevents your system from running out of memory.
+The code block below provides sample syntax for those queries:
+
+```
+SELECT *
+INTO <destination_database>.<retention_policy_name>.<measurement_name>
+FROM <source_database>.<retention_policy_name>.<measurement_name>
+WHERE time > now() - 100w and time < now() - 90w GROUP BY *
+
+SELECT *
+INTO <destination_database>.<retention_policy_name>.<measurement_name>
+FROM <source_database>.<retention_policy_name>.<measurement_name>
+WHERE time > now() - 90w and time < now() - 80w GROUP BY *
+
+SELECT *
+INTO <destination_database>.<retention_policy_name>.<measurement_name>
+FROM <source_database>.<retention_policy_name>.<measurement_name>
+WHERE time > now() - 80w and time < now() - 70w GROUP BY *
+```
+
+#### Example 2: Write the results of a query to a measurement
+
+```
+> SELECT "water_level" INTO "h2o_feet_copy_1" FROM "h2o_feet" WHERE "location" = 'coyote_creek'
+
+name: result
+------------
+time                   written
+1970-01-01T00:00:00Z   7604
+
+> SELECT * FROM "h2o_feet_copy_1"
+
+name: h2o_feet_copy_1
+---------------------
+time                   water_level
+2015-08-18T00:00:00Z   8.12
+[...]
+2015-09-18T16:48:00Z   4
+```
+
+The query writes its results to a new [measurement](/influxdb/v1.3/concepts/glossary/#measurement): `h2o_feet_copy_1`.
+If you're using the [CLI](/influxdb/v1.3/tools/shell/), InfluxDB writes the data to
+the `USE`d [database](/influxdb/v1.3/concepts/glossary/#database) and the `DEFAULT` [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp).
+If you're using the [HTTP API](/influxdb/v1.3/tools/api/), InfluxDB writes the
+data to the database and retention policy specified in the `db` and `rp`
+[query string parameters](/influxdb/v1.3/tools/api/#query-string-parameters).
+If you do not set the `rp` query string parameter, the HTTP API automatically
+writes the data to the database's `DEFAULT` retention policy.
+
+The response shows the number of points (`7604`) that InfluxDB writes to `h2o_feet_copy_1`.
+The timestamp in the response is meaningless; InfluxDB uses epoch 0
+(`1970-01-01T00:00:00Z`) as a null timestamp equivalent.
+
+#### Example 3: Write the results of a query to a fully qualified measurement
+
+```
+> SELECT "water_level" INTO "where_else"."autogen"."h2o_feet_copy_2" FROM "h2o_feet" WHERE "location" = 'coyote_creek'
+
+name: result
+------------
+time                   written
+1970-01-01T00:00:00Z   7604
+
+> SELECT * FROM "where_else"."autogen"."h2o_feet_copy_2"
+
+name: h2o_feet_copy_2
+---------------------
+time                   water_level
+2015-08-18T00:00:00Z   8.12
+[...]
+2015-09-18T16:48:00Z   4
+```
+
+The query writes its results to a new measurement: `h2o_feet_copy_2`.
+InfluxDB writes the data to the `where_else` database and to the `autogen`
+retention policy.
+Note that both `where_else` and `autogen` must exist prior to running the `INTO`
+query.
+See [Database Management](/influxdb/v1.3/query_language/database_management/)
+for how to manage databases and retention policies.
+
+The response shows the number of points (`7604`) that InfluxDB writes to `h2o_feet_copy_2`.
+The timestamp in the response is meaningless; InfluxDB uses epoch 0
+(`1970-01-01T00:00:00Z`) as a null timestamp equivalent.
+
+#### Example 4: Write aggregated results to a measurement (downsampling)
+
+```
+> SELECT MEAN("water_level") INTO "all_my_averages" FROM "h2o_feet" WHERE "location" = 'coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
+
+name: result
+------------
+time                   written
+1970-01-01T00:00:00Z   3
+
+> SELECT * FROM "all_my_averages"
+
+name: all_my_averages
+---------------------
+time                   mean
+2015-08-18T00:00:00Z   8.0625
+2015-08-18T00:12:00Z   7.8245
+2015-08-18T00:24:00Z   7.5675
+```
+
+The query aggregates data using an
+InfluxQL [function](/influxdb/v1.3/query_language/functions) and a [`GROUP BY
+time()` clause](#group-by-time-intervals).
+It also writes its results to the `all_my_averages` measurement.
+
+The response shows the number of points (`3`) that InfluxDB writes to `all_my_averages`.
+The timestamp in the response is meaningless; InfluxDB uses epoch 0
+(`1970-01-01T00:00:00Z`) as a null timestamp equivalent.
+
+The query is an example of downsampling: taking higher precision data,
+aggregating those data to a lower precision, and storing the lower precision
+data in the database.
+Downsampling is a common use case for the `INTO` clause.
+ +#### Example 5: Write aggregated results for more than one measurement to a different database (downsampling with backreferencing) + +``` +> SELECT MEAN(*) INTO "where_else"."autogen".:MEASUREMENT FROM /.*/ WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:06:00Z' GROUP BY time(12m) + +name: result +time written +---- ------- +1970-01-01T00:00:00Z 5 + +> SELECT * FROM "where_else"."autogen"./.*/ + +name: average_temperature +time mean_degrees mean_index mean_pH mean_water_level +---- ------------ ---------- ------- ---------------- +2015-08-18T00:00:00Z 78.5 + +name: h2o_feet +time mean_degrees mean_index mean_pH mean_water_level +---- ------------ ---------- ------- ---------------- +2015-08-18T00:00:00Z 5.07625 + +name: h2o_pH +time mean_degrees mean_index mean_pH mean_water_level +---- ------------ ---------- ------- ---------------- +2015-08-18T00:00:00Z 6.75 + +name: h2o_quality +time mean_degrees mean_index mean_pH mean_water_level +---- ------------ ---------- ------- ---------------- +2015-08-18T00:00:00Z 51.75 + +name: h2o_temperature +time mean_degrees mean_index mean_pH mean_water_level +---- ------------ ---------- ------- ---------------- +2015-08-18T00:00:00Z 63.75 +``` + +The query aggregates data using an +InfluxQL [function](/influxdb/v1.3/query_language/functions) and a [`GROUP BY +time()` clause](#group-by-time-intervals). +It aggregates data in every measurement that matches the [regular expression](#regular-expressions) +in the `FROM` clause and writes the results to measurements with the same name in the +`where_else` database and the `autogen` retention policy. +Note that both `where_else` and `autogen` must exist prior to running the `INTO` +query. +See [Database Management](/influxdb/v1.3/query_language/database_management/) +for how to manage databases and retention policies. + +The response shows the number of points (`5`) that InfluxDB writes to the `where_else` +database and the `autogen` retention policy. 
+The timestamp in the response is meaningless; InfluxDB uses epoch 0
+(`1970-01-01T00:00:00Z`) as a null timestamp equivalent.
+
+The query is an example of downsampling with backreferencing.
+It takes higher precision data from more than one measurement,
+aggregates those data to a lower precision, and stores the lower precision
+data in the database.
+Downsampling with backreferencing is a common use case for the `INTO` clause.
+
+### Common Issues with the `INTO` clause
+
+#### Issue 1: Missing data
+
+If an `INTO` query includes a [tag key](/influxdb/v1.3/concepts/glossary#tag-key) in the [`SELECT` clause](#the-basic-select-statement), the query converts [tags](/influxdb/v1.3/concepts/glossary#tag) in the current
+measurement to [fields](/influxdb/v1.3/concepts/glossary#field) in the destination measurement.
+This can cause InfluxDB to overwrite [points](/influxdb/v1.3/concepts/glossary#point) that were previously differentiated
+by a [tag value](/influxdb/v1.3/concepts/glossary#tag-value).
+Note that this behavior does not apply to queries that use the [`TOP()`](/influxdb/v1.3/query_language/functions/#top) or [`BOTTOM()`](/influxdb/v1.3/query_language/functions/#bottom) functions.
+The
+[Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-are-my-into-queries-missing-data)
+document describes that behavior in detail.
+
+To preserve tags in the current measurement as tags in the destination measurement,
+[`GROUP BY` the relevant tag key](#group-by-tags) or `GROUP BY *` in the `INTO` query.
+
+#### Issue 2: Automating queries with the `INTO` clause
+
+The `INTO` clause section in this document shows how to manually implement
+queries with an `INTO` clause.
+See the [Continuous Queries](/influxdb/v1.3/query_language/continuous_queries/)
+documentation for how to automate `INTO` clause queries on real-time data.
+Among [other uses](/influxdb/v1.3/query_language/continuous_queries/#continuous-query-use-cases), +Continuous Queries automate the downsampling process. + +
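The downsampling performed by Example 4's `GROUP BY time(12m)` query can be sketched outside InfluxDB. This Python illustration (an assumption-laden model, not InfluxDB's engine) floors each timestamp to the start of its 12-minute bucket and averages the values in each bucket:

```python
# A sketch of downsampling: bucket raw points into 12-minute intervals
# (floored against the Unix epoch, as GROUP BY time(12m) does) and take the mean.
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def downsample(points, interval=timedelta(minutes=12)):
    """points: list of (datetime, value). Returns {bucket_start_datetime: mean}."""
    buckets = {}
    for ts, value in points:
        seconds = (ts - EPOCH).total_seconds()
        start = EPOCH + timedelta(seconds=seconds - seconds % interval.total_seconds())
        buckets.setdefault(start, []).append(value)
    return {start: sum(vs) / len(vs) for start, vs in buckets.items()}

utc = timezone.utc
points = [(datetime(2015, 8, 18, 0, 0, tzinfo=utc), 8.12),
          (datetime(2015, 8, 18, 0, 6, tzinfo=utc), 8.005),
          (datetime(2015, 8, 18, 0, 12, tzinfo=utc), 7.887)]
means = downsample(points)
# First bucket mean = (8.12 + 8.005) / 2 = 8.0625, matching the docs' example.
print(means)
```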
+
+
+# ORDER BY time DESC
+By default, InfluxDB returns results in ascending time order; the first [point](/influxdb/v1.3/concepts/glossary/#point)
+returned has the oldest [timestamp](/influxdb/v1.3/concepts/glossary/#timestamp) and
+the last point returned has the most recent timestamp.
+`ORDER BY time DESC` reverses that order such that InfluxDB returns the points
+with the most recent timestamps first.
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] ORDER BY time DESC
+```
+
+### Description of Syntax
+
+`ORDER BY time DESC` must appear after the [`GROUP BY` clause](#the-group-by-clause)
+if the query includes a `GROUP BY` clause.
+`ORDER BY time DESC` must appear after the [`WHERE` clause](#the-where-clause)
+if the query includes a `WHERE` clause and no `GROUP BY` clause.
+
+### Examples
+
+#### Example 1: Return the newest points first
+
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' ORDER BY time DESC
+
+name: h2o_feet
+time                   water_level
+----                   -----------
+2015-09-18T21:42:00Z   4.938
+2015-09-18T21:36:00Z   5.066
+[...]
+2015-08-18T00:06:00Z   2.116
+2015-08-18T00:00:00Z   2.064
+```
+
+The query returns the points with the most recent timestamps from the
+`h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement) first.
+Without `ORDER BY time DESC`, the query would return `2015-08-18T00:00:00Z`
+first and `2015-09-18T21:42:00Z` last.
+ +#### Example 2: Return the newest points first and include a GROUP BY time() clause +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY time(12m) ORDER BY time DESC + +name: h2o_feet +time mean +---- ---- +2015-08-18T00:36:00Z 4.6825 +2015-08-18T00:24:00Z 4.80675 +2015-08-18T00:12:00Z 4.950749999999999 +2015-08-18T00:00:00Z 5.07625 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions) +and a time interval in the [GROUP BY clause](#group-by-time-intervals) +to calculate the average `water_level` for each twelve-minute +interval in the query's time range. +`ORDER BY time DESC` returns the most recent 12-minute time intervals +first. + +Without `ORDER BY time DESC`, the query would return +`2015-08-18T00:00:00Z` first and `2015-08-18T00:36:00Z` last. + +
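One property worth knowing when working with these results: RFC3339 UTC timestamps of equal precision sort lexicographically in chronological order, so a reversed string sort is enough to model `ORDER BY time DESC` client-side. A small Python sketch (an illustration, not server behavior):

```python
# RFC3339 UTC timestamps with the same precision compare correctly as strings,
# so sorting the time column in reverse models ORDER BY time DESC.
rows = [("2015-08-18T00:00:00Z", 2.064),
        ("2015-09-18T21:42:00Z", 4.938),
        ("2015-08-18T00:06:00Z", 2.116)]

newest_first = sorted(rows, key=lambda r: r[0], reverse=True)
print(newest_first[0])  # ('2015-09-18T21:42:00Z', 4.938)
```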
+
+
+# The LIMIT and SLIMIT clauses
+
+`LIMIT` and `SLIMIT` limit the number of
+[points](/influxdb/v1.3/concepts/glossary/#point) and the number of
+[series](/influxdb/v1.3/concepts/glossary/#series) returned per query.
+
+## The LIMIT Clause
+`LIMIT <N>` returns the first `N` [points](/influxdb/v1.3/concepts/glossary/#point) from the specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] LIMIT <N>
+```
+
+### Description of Syntax
+
+`N` specifies the number of [points](/influxdb/v1.3/concepts/glossary/#point) to return from the specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+If `N` is greater than the number of points in a measurement, InfluxDB returns
+all points from that measurement.
+
+Note that the `LIMIT` clause must appear in the order outlined in the syntax above.
+
+### Examples
+
+#### Example 1: Limit the number of points returned
+```
+> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3
+
+name: h2o_feet
+time                   water_level   location
+----                   -----------   --------
+2015-08-18T00:00:00Z   8.12          coyote_creek
+2015-08-18T00:00:00Z   2.064         santa_monica
+2015-08-18T00:06:00Z   8.005         coyote_creek
+```
+
+The query returns the three oldest [points](/influxdb/v1.3/concepts/glossary/#point) (determined by timestamp) from the
+`h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+#### Example 2: Limit the number of points returned and include a GROUP BY clause
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY *,time(12m) LIMIT 2
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   mean
+----                   ----
+2015-08-18T00:00:00Z   8.0625
+2015-08-18T00:12:00Z   7.8245
+
+name: h2o_feet
+tags: location=santa_monica
+time                   mean
+----                   ----
+2015-08-18T00:00:00Z   2.09
+2015-08-18T00:12:00Z   2.077
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions)
+and a [GROUP BY clause](#group-by-time-intervals)
+to calculate the average `water_level` for each [tag](/influxdb/v1.3/concepts/glossary/#tag) and for each twelve-minute
+interval in the query's time range.
+`LIMIT 2` requests the two oldest twelve-minute averages (determined by timestamp).
+
+Note that without `LIMIT 2`, the query would return four points per [series](/influxdb/v1.3/concepts/glossary/#series);
+one for each twelve-minute interval in the query's time range.
+
+## The `SLIMIT` Clause
+`SLIMIT <N>` returns every [point](/influxdb/v1.3/concepts/glossary/#point) from `N` [series](/influxdb/v1.3/concepts/glossary/#series) in the specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(<time_interval>)] [ORDER_BY_clause] SLIMIT <N>
+```
+
+### Description of Syntax
+`N` specifies the number of [series](/influxdb/v1.3/concepts/glossary/#series) to return from the specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+If `N` is greater than the number of series in a measurement, InfluxDB returns
+all series from that measurement.
+
+There is an [ongoing issue](https://github.com/influxdata/influxdb/issues/7571) that requires queries with `SLIMIT` to include `GROUP BY *`.
+Note that the `SLIMIT` clause must appear in the order outlined in the syntax above.
+
+### Examples
+
+#### Example 1: Limit the number of series returned
+```
+> SELECT "water_level" FROM "h2o_feet" GROUP BY * SLIMIT 1
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   water_level
+----                   -----------
+2015-08-18T00:00:00Z   8.12
+2015-08-18T00:06:00Z   8.005
+2015-08-18T00:12:00Z   7.887
+[...]
+2015-09-18T16:12:00Z   3.402
+2015-09-18T16:18:00Z   3.314
+2015-09-18T16:24:00Z   3.235
+```
+
+The query returns all `water_level` [points](/influxdb/v1.3/concepts/glossary/#point) from one of the [series](/influxdb/v1.3/concepts/glossary/#series) associated
+with the `h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+
+#### Example 2: Limit the number of series returned and include a GROUP BY time() clause
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY *,time(12m) SLIMIT 1
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   mean
+----                   ----
+2015-08-18T00:00:00Z   8.0625
+2015-08-18T00:12:00Z   7.8245
+2015-08-18T00:24:00Z   7.5675
+2015-08-18T00:36:00Z   7.303
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions)
+and a time interval in the [GROUP BY clause](#group-by-time-intervals)
+to calculate the average `water_level` for each twelve-minute
+interval in the query's time range.
+`SLIMIT 1` requests a single series
+associated with the `h2o_feet` measurement.
+
+Note that without `SLIMIT 1`, the query would return results for the two series
+associated with the `h2o_feet` measurement: `location=coyote_creek` and
+`location=santa_monica`.
+
+## LIMIT and SLIMIT
+`LIMIT <N1>` followed by `SLIMIT <N2>` returns the first `N1` [points](/influxdb/v1.3/concepts/glossary/#point) from `N2` [series](/influxdb/v1.3/concepts/glossary/#series) in the specified measurement.
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(<time_interval>)] [ORDER_BY_clause] LIMIT <N1> SLIMIT <N2>
+```
+
+### Description of Syntax
+
+`N1` specifies the number of [points](/influxdb/v1.3/concepts/glossary/#point) to return per [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+If `N1` is greater than the number of points in a measurement, InfluxDB returns all points from that measurement.
+
+`N2` specifies the number of series to return from the specified [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+If `N2` is greater than the number of series in a measurement, InfluxDB returns all series from that measurement.
+
+There is an [ongoing issue](https://github.com/influxdata/influxdb/issues/7571) that requires queries with `LIMIT` and `SLIMIT` to include `GROUP BY *`.
+Note that the `LIMIT` and `SLIMIT` clauses must appear in the order outlined in the syntax above.
+
+### Examples
+
+#### Example 1: Limit the number of points and series returned
+```
+> SELECT "water_level" FROM "h2o_feet" GROUP BY * LIMIT 3 SLIMIT 1
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   water_level
+----                   -----------
+2015-08-18T00:00:00Z   8.12
+2015-08-18T00:06:00Z   8.005
+2015-08-18T00:12:00Z   7.887
+```
+
+The query returns the three oldest [points](/influxdb/v1.3/concepts/glossary/#point) (determined by timestamp) from one
+of the [series](/influxdb/v1.3/concepts/glossary/#series) associated with the
+[measurement](/influxdb/v1.3/concepts/glossary/#measurement) `h2o_feet`.
+ +#### Example 2: Limit the number of points and series returned and include a GROUP BY time() clause +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY *,time(12m) LIMIT 2 SLIMIT 1 + +name: h2o_feet +tags: location=coyote_creek +time mean +---- ---- +2015-08-18T00:00:00Z 8.0625 +2015-08-18T00:12:00Z 7.8245 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions) +and a time interval in the [GROUP BY clause](#group-by-time-intervals) +to calculate the average `water_level` for each twelve-minute +interval in the query's time range. +`LIMIT 2` requests the two oldest twelve-minute averages (determined by +timestamp) and `SLIMIT 1` requests a single series +associated with the `h2o_feet` measurement. + +Note that without `LIMIT 2 SLIMIT 1`, the query would return four points +for each of the two series associated with the `h2o_feet` measurement. + +
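The combined `LIMIT`/`SLIMIT` semantics can be sketched in a few lines of Python: `SLIMIT` keeps the first `N2` series, and `LIMIT` keeps the first `N1` (oldest) points within each kept series. The series keys and values below are illustrative, echoing the `h2o_feet` examples; this is a model of the described behavior, not InfluxDB code:

```python
# A sketch of LIMIT <N1> combined with SLIMIT <N2>.
def limit_slimit(series, n1, n2):
    """series: dict {series_key: [(time, value), ...] sorted by time}."""
    kept = dict(list(series.items())[:n2])           # SLIMIT <N2>: first N2 series
    return {k: pts[:n1] for k, pts in kept.items()}  # LIMIT <N1>: first N1 points each

series = {
    "location=coyote_creek": [("t1", 8.12), ("t2", 8.005), ("t3", 7.887), ("t4", 7.776)],
    "location=santa_monica": [("t1", 2.064), ("t2", 2.116)],
}
print(limit_slimit(series, n1=3, n2=1))
# {'location=coyote_creek': [('t1', 8.12), ('t2', 8.005), ('t3', 7.887)]}
```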
+
+
+# The OFFSET and SOFFSET Clauses
+
+`OFFSET` and `SOFFSET` paginate the [points](/influxdb/v1.3/concepts/glossary/#point) and [series](/influxdb/v1.3/concepts/glossary/#series) returned.
+
+## The `OFFSET` clause
+`OFFSET <N>` paginates `N` [points](/influxdb/v1.3/concepts/glossary/#point) in the query results.
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] LIMIT_clause OFFSET <N> [SLIMIT_clause]
+```
+
+### Description of Syntax
+`N` specifies the number of [points](/influxdb/v1.3/concepts/glossary/#point) to paginate.
+The `OFFSET` clause requires a [`LIMIT` clause](#the-limit-clause).
+Using the `OFFSET` clause without a `LIMIT` clause can cause [inconsistent
+query results](https://github.com/influxdata/influxdb/issues/7577).
+
+> **Note:** InfluxDB returns no results if the `WHERE` clause includes a time
+range and the `OFFSET` clause would cause InfluxDB to return points with
+timestamps outside of that time range.
+
+### Examples
+
+#### Example 1: Paginate points
+```
+> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3 OFFSET 3
+
+name: h2o_feet
+time                   water_level   location
+----                   -----------   --------
+2015-08-18T00:06:00Z   2.116         santa_monica
+2015-08-18T00:12:00Z   7.887         coyote_creek
+2015-08-18T00:12:00Z   2.028         santa_monica
+```
+
+The query returns the fourth, fifth, and sixth [points](/influxdb/v1.3/concepts/glossary/#point) from the `h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement).
+If the query did not include `OFFSET 3`, it would return the first, second,
+and third points from that measurement.
+
+#### Example 2: Paginate points and include several clauses
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY *,time(12m) ORDER BY time DESC LIMIT 2 OFFSET 2 SLIMIT 1
+
+name: h2o_feet
+tags: location=coyote_creek
+time                   mean
+----                   ----
+2015-08-18T00:12:00Z   7.8245
+2015-08-18T00:00:00Z   8.0625
+```
+
+This example is pretty involved, so here's the clause-by-clause breakdown:
+
+The [`SELECT` clause](#the-basic-select-statement) specifies an InfluxQL [function](/influxdb/v1.3/query_language/functions).
+The [`FROM` clause](#the-basic-select-statement) specifies a single measurement.
+The [`WHERE` clause](#the-where-clause) specifies the time range for the query.
+The [`GROUP BY` clause](#the-group-by-clause) groups results by all tags (`*`) and into 12-minute intervals.
+The [`ORDER BY time DESC` clause](#order-by-time-desc) returns results in descending timestamp order.
+The [`LIMIT 2` clause](#the-limit-clause) limits the number of points returned to two.
+The `OFFSET 2` clause excludes the first two averages from the query results.
+The [`SLIMIT 1` clause](#the-slimit-clause) limits the number of series returned to one.
+
+Without `OFFSET 2`, the query would return the first two averages of the query results:
+```
+name: h2o_feet
+tags: location=coyote_creek
+time                   mean
+----                   ----
+2015-08-18T00:36:00Z   7.303
+2015-08-18T00:24:00Z   7.5675
+```
+
+## The `SOFFSET` clause
+`SOFFSET <N>` paginates `N` [series](/influxdb/v1.3/concepts/glossary/#series) in the query results.
+
+### Syntax
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(<time_interval>)] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] SLIMIT_clause SOFFSET <N>
+```
+
+### Description of Syntax
+`N` specifies the number of [series](/influxdb/v1.3/concepts/glossary/#series) to paginate.
+The `SOFFSET` clause requires an [`SLIMIT` clause](#the-slimit-clause).
+Using the `SOFFSET` clause without an `SLIMIT` clause can cause [inconsistent +query results](https://github.com/influxdata/influxdb/issues/7578). +There is an [ongoing issue](https://github.com/influxdata/influxdb/issues/7571) that requires queries with `SLIMIT` to include `GROUP BY *`. + +> **Note:** InfluxDB returns no results if the `SOFFSET` clause paginates +through more than the total number of series. + +### Examples + +#### Example 1: Paginate series +``` +> SELECT "water_level" FROM "h2o_feet" GROUP BY * SLIMIT 1 SOFFSET 1 + +name: h2o_feet +tags: location=santa_monica +time water_level +---- ----------- +2015-08-18T00:00:00Z 2.064 +2015-08-18T00:06:00Z 2.116 +[...] +2015-09-18T21:36:00Z 5.066 +2015-09-18T21:42:00Z 4.938 +``` + +The query returns data for the [series](/influxdb/v1.3/concepts/glossary/#series) associated with the `h2o_feet` +[measurement](/influxdb/v1.3/concepts/glossary/#measurement) and the `location = santa_monica` [tag](/influxdb/v1.3/concepts/glossary/#tag). +Without `SOFFSET 1`, the query returns data for the series associated with the +`h2o_feet` measurement and the `location = coyote_creek` tag. + +#### Example 2: Paginate series and include all clauses +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:42:00Z' GROUP BY *,time(12m) ORDER BY time DESC LIMIT 2 OFFSET 2 SLIMIT 1 SOFFSET 1 + +name: h2o_feet +tags: location=santa_monica +time mean +---- ---- +2015-08-18T00:12:00Z 2.077 +2015-08-18T00:00:00Z 2.09 +``` + +This example is pretty involved, so here's the clause-by-clause breakdown: + +The [`SELECT` clause](#the-basic-select-statement) specifies an InfluxQL [function](/influxdb/v1.3/query_language/functions). +The [`FROM` clause](#the-basic-select-statement) specifies a single measurement. +The [`WHERE` clause](#the-where-clause) specifies the time range for the query. 
+The [`GROUP BY` clause](#the-group-by-clause) groups results by all tags (`*`) and into 12-minute intervals. +The [`ORDER BY time DESC` clause](#order-by-time-desc) returns results in descending timestamp order. +The [`LIMIT 2` clause](#the-limit-clause) limits the number of points returned to two. +The [`OFFSET 2` clause](#the-offset-clause) excludes the first two averages from the query results. +The [`SLIMIT 1` clause](#the-slimit-clause) limits the number of series returned to one. +The `SOFFSET 1` clause paginates the series returned. + +Without `SOFFSET 1`, the query would return the results for a different series: +``` +name: h2o_feet +tags: location=coyote_creek +time mean +---- ---- +2015-08-18T00:12:00Z 7.8245 +2015-08-18T00:00:00Z 8.0625 +``` + +
+
+
+# The Time Zone Clause
+
+The `tz()` clause returns the UTC offset for the specified timezone.
+
+### Syntax
+
+```
+SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] [SLIMIT_clause] [SOFFSET_clause] tz('<time_zone>')
+```
+
+### Description of Syntax
+
+By default, InfluxDB stores and returns timestamps in UTC.
+The `tz()` clause applies the UTC offset or, if applicable, the UTC Daylight Savings Time (DST) offset to the query's returned timestamps.
+The returned timestamps must be in [RFC3339 format](/influxdb/v1.3/query_language/data_exploration/#issue-3-configuring-the-returned-timestamps) for the UTC offset or UTC DST to appear.
+The `time_zone` parameter follows the TZ syntax in the [Internet Assigned Numbers Authority time zone database](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List) and it requires single quotes.
+
+### Examples
+
+#### Example 1: Return the UTC offset for Chicago's time zone
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:18:00Z' tz('America/Chicago')
+
+name: h2o_feet
+time water_level
+---- -----------
+2015-08-17T19:00:00-05:00 2.064
+2015-08-17T19:06:00-05:00 2.116
+2015-08-17T19:12:00-05:00 2.028
+2015-08-17T19:18:00-05:00 2.126
+```
+
+The query results include the UTC offset (`-05:00`) for the `America/Chicago` time zone in the timestamps.
+
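As a sanity check outside of InfluxDB, the same UTC-to-`America/Chicago` conversion can be reproduced with Python's standard `zoneinfo` module; this snippet is illustrative only and is not part of InfluxQL:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# First timestamp from the example above, stored by InfluxDB in UTC
utc_ts = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)

# What tz('America/Chicago') does conceptually: shift the display offset
local_ts = utc_ts.astimezone(ZoneInfo("America/Chicago"))
print(local_ts.isoformat())  # 2015-08-17T19:00:00-05:00
```

The instant in time is unchanged; only the offset in which it is rendered differs.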
+
+# Time Syntax + +For most `SELECT` statements, the default time range is between [`1677-09-21 00:12:43.145224194` and `2262-04-11T23:47:16.854775806Z` UTC](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#what-are-the-minimum-and-maximum-timestamps-that-influxdb-can-store). +For `SELECT` statements with a [`GROUP BY time()` clause](#group-by-time-intervals), the default time +range is between `1677-09-21 00:12:43.145224194` UTC and [`now()`](/influxdb/v1.3/concepts/glossary/#now). +The following sections detail how to specify alternative time ranges in the `SELECT` +statement's [`WHERE` clause](#the-where-clause). + + + + + + + +
+<table style="width:100%">
+  <tr>
+    <td><a href="#absolute-time">Absolute Time</a></td>
+    <td><a href="#relative-time">Relative Time</a></td>
+    <td><a href="#common-issues-with-time-syntax">Common Issues with Time Syntax</a></td>
+  </tr>
+</table>
+
+
+
+
+## Absolute Time
+
+Specify absolute time with date-time strings and epoch time.
+
+### Syntax
+```
+SELECT_clause FROM_clause WHERE time <operator> ['<rfc3339_date_time_string>' | '<rfc3339_like_date_time_string>' | <epoch_time>] [AND ['<rfc3339_date_time_string>' | '<rfc3339_like_date_time_string>' | <epoch_time>] [...]]
+```
+
+### Description of Syntax
+
+#### Supported operators
+
+`=`   equal to
+`<>` not equal to
+`!=` not equal to
+`>`   greater than
+`>=` greater than or equal to
+`<`   less than
+`<=` less than or equal to
+
+Currently, InfluxDB does not support using `OR` with absolute time in the `WHERE`
+clause. See the [Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-is-my-query-with-a-where-or-time-clause-returning-empty-results)
+document and the [GitHub Issue](https://github.com/influxdata/influxdb/issues/7530)
+for more information.
+
+#### rfc3339_date_time_string
+
+```
+'YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ'
+```
+
+`.nnnnnnnnn` is optional and is set to `.000000000` if not included.
+The [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) date-time string requires single quotes.
+
+#### rfc3339_like_date_time_string
+
+```
+'YYYY-MM-DD HH:MM:SS.nnnnnnnnn'
+```
+
+`HH:MM:SS.nnnnnnnnn` is optional and is set to `00:00:00.000000000` if not included.
+The RFC3339-like date-time string requires single quotes.
+
+#### epoch_time
+
+Epoch time is the amount of time that has elapsed since 00:00:00
+Coordinated Universal Time (UTC), Thursday, 1 January 1970.
+
+By default, InfluxDB assumes that all epoch timestamps are in nanoseconds.
+Include a [duration literal](/influxdb/v1.3/query_language/spec/#durations)
+at the end of the epoch timestamp to indicate a precision other than nanoseconds.
+
+#### Basic Arithmetic
+
+All timestamp formats support basic arithmetic.
+Add (`+`) or subtract (`-`) a time from a timestamp with a [duration literal](/influxdb/v1.3/query_language/spec/#durations).
+Note that InfluxQL requires whitespace between the `+` or `-` and the
+duration literal.
+ +### Examples + +#### Example 1: Specify a time range with RFC3339 date-time strings +``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00.000000000Z' AND time <= '2015-08-18T00:12:00Z' + +name: h2o_feet +time water_level +---- ----------- +2015-08-18T00:00:00Z 2.064 +2015-08-18T00:06:00Z 2.116 +2015-08-18T00:12:00Z 2.028 +``` + +The query returns data with timestamps between August 18, 2015 at 00:00:00.000000000 and +August 18, 2015 at 00:12:00. +The nanosecond specification in the first timestamp (`.000000000`) +is optional. + +Note that the single quotes around the RFC3339 date-time strings are required. + +#### Example 2: Specify a time range with RFC3339-like date-time strings + +``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18' AND time <= '2015-08-18 00:12:00' + +name: h2o_feet +time water_level +---- ----------- +2015-08-18T00:00:00Z 2.064 +2015-08-18T00:06:00Z 2.116 +2015-08-18T00:12:00Z 2.028 +``` + +The query returns data with timestamps between August 18, 2015 at 00:00:00 and August 18, 2015 +at 00:12:00. +The first date-time string does not include a time; InfluxDB assumes the time +is 00:00:00. + +Note that the single quotes around the RFC3339-like date-time strings are +required. + + +#### Example 3: Specify a time range with epoch timestamps +``` +> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= 1439856000000000000 AND time <= 1439856720000000000 + +name: h2o_feet +time water_level +---- ----------- +2015-08-18T00:00:00Z 2.064 +2015-08-18T00:06:00Z 2.116 +2015-08-18T00:12:00Z 2.028 +``` + +The query returns data with timestamps that occur between August 18, 2015 +at 00:00:00 and August 18, 2015 at 00:12:00. +By default InfluxDB assumes epoch timestamps are in nanoseconds. 
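Outside of InfluxQL, the nanosecond epoch values used in Example 3 are easy to double-check; this Python snippet is illustrative and not part of InfluxDB:

```python
from datetime import datetime, timezone

# Lower bound of the time range in Example 3, expressed as a datetime
lower = datetime(2015, 8, 18, 0, 0, 0, tzinfo=timezone.utc)

# InfluxDB interprets bare epoch timestamps as nanoseconds
lower_ns = int(lower.timestamp()) * 1_000_000_000
print(lower_ns)  # 1439856000000000000
```

The same conversion with `2015-08-18T00:12:00Z` yields the upper bound, `1439856720000000000`.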
+
+#### Example 4: Specify a time range with second-precision epoch timestamps
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= 1439856000s AND time <= 1439856720s
+
+name: h2o_feet
+time water_level
+---- -----------
+2015-08-18T00:00:00Z 2.064
+2015-08-18T00:06:00Z 2.116
+2015-08-18T00:12:00Z 2.028
+```
+
+The query returns data with timestamps that occur between August 18, 2015
+at 00:00:00 and August 18, 2015 at 00:12:00.
+The `s` [duration literal](/influxdb/v1.3/query_language/spec/#durations) at the
+end of the epoch timestamps indicates that the epoch timestamps are in seconds.
+
+#### Example 5: Perform basic arithmetic on an RFC3339-like date-time string
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE time > '2015-09-18T21:24:00Z' + 6m
+
+name: h2o_feet
+time water_level
+---- -----------
+2015-09-18T21:36:00Z 5.066
+2015-09-18T21:42:00Z 4.938
+```
+
+The query returns data with timestamps that occur more than six minutes after
+September 18, 2015 at 21:24:00, that is, after 21:30:00.
+Note that the whitespace between the `+` and `6m` is required.
+
+#### Example 6: Perform basic arithmetic on an epoch timestamp
+
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE time > 24043524m - 6m
+
+name: h2o_feet
+time water_level
+---- -----------
+2015-09-18T21:24:00Z 5.013
+2015-09-18T21:30:00Z 5.01
+2015-09-18T21:36:00Z 5.066
+2015-09-18T21:42:00Z 4.938
+```
+
+The query returns data with timestamps that occur after September 18, 2015
+at 21:18:00, that is, after the point six minutes before 21:24:00 (the
+timestamp that `24043524m` represents).
+Note that the whitespace between the `-` and `6m` is required.
+
+## Relative time
+Use [`now()`](/influxdb/v1.3/concepts/glossary/#now) to query data with [timestamps](/influxdb/v1.3/concepts/glossary/#timestamp) relative to the server's current timestamp.
+
+### Syntax
+```
+SELECT_clause FROM_clause WHERE time <operator> now() [[ - | + ] <duration_literal>] [(AND|OR) now() [...]]
+```
+
+### Description of Syntax
+
+`now()` is the Unix time of the server at the time the query is executed on that server.
+The whitespace between `-` or `+` and the [duration literal](/influxdb/v1.3/query_language/spec/#durations) is required.
+
+#### Supported operators
+
+`=`   equal to
+`<>` not equal to
+`!=` not equal to
+`>`   greater than
+`>=` greater than or equal to
+`<`   less than
+`<=` less than or equal to
+
+#### duration_literal
+
+`u` or `µ` microseconds
+`ms`       milliseconds
+`s`      seconds
+`m`      minutes
+`h`      hours
+`d`      days
+`w`      weeks
+
+### Examples
+
+#### Example 1: Specify a time range with relative time
+```
+> SELECT "water_level" FROM "h2o_feet" WHERE time > now() - 1h
+```
+
+The query returns data with timestamps that occur within the past hour.
+The whitespace between `-` and `1h` is required.
+
+#### Example 2: Specify a time range with absolute time and relative time
+```
+> SELECT "level description" FROM "h2o_feet" WHERE time > '2015-09-18T21:18:00Z' AND time < now() + 1000d
+
+name: h2o_feet
+time level description
+---- -----------------
+2015-09-18T21:24:00Z between 3 and 6 feet
+2015-09-18T21:30:00Z between 3 and 6 feet
+2015-09-18T21:36:00Z between 3 and 6 feet
+2015-09-18T21:42:00Z between 3 and 6 feet
+```
+
+The query returns data with timestamps that occur between September 18, 2015
+at 21:18:00 and 1000 days from `now()`.
+The whitespace between `+` and `1000d` is required.
+
+## Common Issues with Time Syntax
+
+### Issue 1: Using `OR` with absolute time
+
+Currently, InfluxDB does not support using `OR` with absolute time
+in the `WHERE` clause.
+
+See the [Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#why-is-my-query-with-a-where-or-time-clause-returning-empty-results)
+document for more information.
+ +### Issue 2: Querying data that occur after `now()` with a `GROUP BY time()` clause + +Most `SELECT` statements have a default time range between [`1677-09-21 00:12:43.145224194` and `2262-04-11T23:47:16.854775806Z` UTC](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#what-are-the-minimum-and-maximum-timestamps-that-influxdb-can-store). +For `SELECT` statements with a [`GROUP BY time()` clause](#group-by-time-intervals), the default time +range is between `1677-09-21 00:12:43.145224194` UTC and [`now()`](/influxdb/v1.3/concepts/glossary/#now). + +To query data with timestamps that occur after `now()`, `SELECT` statements with +a `GROUP BY time()` clause must provide an alternative upper bound in the +`WHERE` clause. + +#### Example + +Use the [CLI](/influxdb/v1.3/tools/shell/) to write a point to the `NOAA_water_database` that occurs after `now()`: +``` +> INSERT h2o_feet,location=santa_monica water_level=3.1 1587074400000000000 +``` + +Run a `GROUP BY time()` query that covers data with timestamps between +`2015-09-18T21:30:00Z` and `now()`: +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='santa_monica' AND time >= '2015-09-18T21:30:00Z' GROUP BY time(12m) fill(none) + +name: h2o_feet +time mean +---- ---- +2015-09-18T21:24:00Z 5.01 +2015-09-18T21:36:00Z 5.002 +``` + +Run a `GROUP BY time()` query that covers data with timestamps between +`2015-09-18T21:30:00Z` and 180 weeks from `now()`: +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='santa_monica' AND time >= '2015-09-18T21:30:00Z' AND time <= now() + 180w GROUP BY time(12m) fill(none) + +name: h2o_feet +time mean +---- ---- +2015-09-18T21:24:00Z 5.01 +2015-09-18T21:36:00Z 5.002 +2020-04-16T22:00:00Z 3.1 +``` + +Note that the `WHERE` clause must provide an alternative **upper** bound to +override the default `now()` upper bound. 
+override the default `now()` upper bound. The following query merely resets
+the lower bound to `now()` such that the query's time range is between
+`now()` and `now()`:
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location"='santa_monica' AND time >= now() GROUP BY time(12m) fill(none)
+>
+```
+
+### Issue 3: Configuring the returned timestamps
+
+The [CLI](/influxdb/v1.3/tools/shell/) returns timestamps in
+nanosecond epoch format by default.
+Specify alternative formats with the
+[`precision <format>` command](/influxdb/v1.3/tools/shell/#influx-commands).
+The [HTTP API](/influxdb/v1.3/tools/api/) returns timestamps
+in [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format by default.
+Specify alternative formats with the
+[`epoch` query string parameter](/influxdb/v1.3/tools/api/#query-string-parameters).
+
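To see what a nanosecond epoch timestamp from the CLI corresponds to in RFC3339 terms, it can also be converted outside of InfluxDB; this Python snippet is illustrative only:

```python
from datetime import datetime, timezone

# A nanosecond epoch timestamp, as the CLI returns by default
ns = 1439856000000000000

# Convert to an RFC3339-style UTC timestamp
ts = datetime.fromtimestamp(ns / 1_000_000_000, tz=timezone.utc)
print(ts.isoformat())  # 2015-08-18T00:00:00+00:00
```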
+
+
+# Regular Expressions
+
+InfluxQL supports using regular expressions when specifying:
+
+* [field keys](/influxdb/v1.3/concepts/glossary/#field-key) and [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key) in the [`SELECT` clause](#the-basic-select-statement)
+* [measurements](/influxdb/v1.3/concepts/glossary/#measurement) in the [`FROM` clause](#the-basic-select-statement)
+* [tag values](/influxdb/v1.3/concepts/glossary/#tag-value) and string [field values](/influxdb/v1.3/concepts/glossary/#field-value) in the [`WHERE` clause](#the-where-clause).
+* [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key) in the [`GROUP BY` clause](#group-by-tags)
+
+Currently, InfluxQL does not support using regular expressions to match
+non-string field values in the
+`WHERE` clause,
+[databases](/influxdb/v1.3/concepts/glossary/#database), and
+[retention policies](/influxdb/v1.3/concepts/glossary/#retention-policy-rp).
+
+> **Note:** Regular expression comparisons are more computationally intensive than exact
+string comparisons; queries with regular expressions are not as performant
+as those without.
+
+### Syntax
+```
+SELECT /<regular_expression_field_key>/ FROM /<regular_expression_measurement>/ WHERE [<tag_key> <operator> /<regular_expression_tag_value>/ | <field_key> <operator> /<regular_expression_field_value>/] GROUP BY /<regular_expression_tag_key>/
+```
+
+### Description of Syntax
+
+Regular expressions are surrounded by `/` characters and use
+[Golang's regular expression syntax](http://golang.org/pkg/regexp/syntax/).
+
+Supported operators:
+`=~` matches against
+`!~` doesn't match against
+
+### Examples
+
+#### Example 1: Use a regular expression to specify field keys and tag keys in the SELECT clause
+```
+> SELECT /l/ FROM "h2o_feet" LIMIT 1
+
+name: h2o_feet
+time level description location water_level
+---- ----------------- -------- -----------
+2015-08-18T00:00:00Z between 6 and 9 feet coyote_creek 8.12
+```
+
+The query selects all [field keys](/influxdb/v1.3/concepts/glossary/#field-key)
+and [tag keys](/influxdb/v1.3/concepts/glossary/#tag-key) that include an `l`.
+Note that the regular expression in the `SELECT` clause must match at least one
+field key in order to return results for a tag key that matches the regular
+expression.
+
+Currently, there is no syntax to distinguish between regular expressions for
+field keys and regular expressions for tag keys in the `SELECT` clause.
+The syntax `/<regular_expression>/::[field | tag]` is not supported.
+
+#### Example 2: Use a regular expression to specify field keys with a function in the SELECT clause
+```
+> SELECT DISTINCT(/level/) FROM "h2o_feet" WHERE "location" = 'santa_monica' AND time >= '2015-08-18T00:00:00.000000000Z' AND time <= '2015-08-18T00:12:00Z'
+
+name: h2o_feet
+time distinct_level description distinct_water_level
+---- -------------------------- --------------------
+2015-08-18T00:00:00Z below 3 feet 2.064
+2015-08-18T00:00:00Z 2.116
+2015-08-18T00:00:00Z 2.028
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to return the distinct [field values](/influxdb/v1.3/concepts/glossary/#field-value)
+for every field key that contains the word `level`.
+
+#### Example 3: Use a regular expression to specify measurements in the FROM clause
+```
+> SELECT MEAN("degrees") FROM /temperature/
+
+name: average_temperature
+time mean
+---- ----
+1970-01-01T00:00:00Z 79.98472932232272
+
+name: h2o_temperature
+time mean
+---- ----
+1970-01-01T00:00:00Z 64.98872722506226
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to calculate the average `degrees` for every [measurement](/influxdb/v1.3/concepts/glossary#measurement) in the `NOAA_water_database`
+[database](/influxdb/v1.3/concepts/glossary#database) that contains the word `temperature`.
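InfluxQL delegates matching to Go's RE2 engine, but for a simple literal pattern like `/temperature/` the behavior is easy to reproduce in other regex engines. A quick illustration using Python's `re` module (which differs from RE2 in some advanced features, though not for patterns this simple):

```python
import re

# Measurement names drawn from the sample data used in these examples
measurements = ["average_temperature", "h2o_feet", "h2o_quality", "h2o_temperature"]

# FROM /temperature/ keeps only measurements whose name matches the pattern
matching = [m for m in measurements if re.search(r"temperature", m)]
print(matching)  # ['average_temperature', 'h2o_temperature']
```

These are exactly the two measurements that appear in Example 3's output.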
+ +#### Example 4: Use a regular expression to specify tag values in the WHERE clause + +``` +> SELECT MEAN(water_level) FROM "h2o_feet" WHERE "location" =~ /[m]/ AND "water_level" > 3 + +name: h2o_feet +time mean +---- ---- +1970-01-01T00:00:00Z 4.47155532049926 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) +to calculate the average `water_level` where the [tag value](/influxdb/v1.3/concepts/glossary#tag-value) of `location` +includes an `m` and `water_level` is greater than three. + +#### Example 5: Use a regular expression to specify a tag with no value in the WHERE clause + +``` +> SELECT * FROM "h2o_feet" WHERE "location" !~ /./ +> +``` + +The query selects all data from the `h2o_feet` measurement where the `location` +[tag](/influxdb/v1.3/concepts/glossary#tag) has no value. +Every data [point](/influxdb/v1.3/concepts/glossary#point) in the `NOAA_water_database` has a tag value for `location`. + +It's possible to perform this same query without a regular expression. +See the +[Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-do-i-select-data-with-a-tag-that-has-no-value) +document for more information. + +#### Example 6: Use a regular expression to specify a tag with a value in the WHERE clause + +``` +> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" =~ /./ + +name: h2o_feet +time mean +---- ---- +1970-01-01T00:00:00Z 4.442107025822523 +``` + +The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/) +to calculate the average `water_level` across all data that have a tag value for +`location`. 
+
+#### Example 7: Use a regular expression to specify a field value in the WHERE clause
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" = 'santa_monica' AND "level description" =~ /between/
+
+name: h2o_feet
+time mean
+---- ----
+1970-01-01T00:00:00Z 4.47155532049926
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to calculate the average `water_level` for all data where the field value of
+`level description` includes the word `between`.
+
+#### Example 8: Use a regular expression to specify tag keys in the GROUP BY clause
+```
+> SELECT FIRST("index") FROM "h2o_quality" GROUP BY /l/
+
+name: h2o_quality
+tags: location=coyote_creek
+time first
+---- -----
+2015-08-18T00:00:00Z 41
+
+name: h2o_quality
+tags: location=santa_monica
+time first
+---- -----
+2015-08-18T00:00:00Z 99
+```
+
+The query uses an InfluxQL [function](/influxdb/v1.3/query_language/functions/)
+to select the first value of `index` for every tag that includes the letter `l`
+in its tag key.
+
+
+# Data Types and Cast Operations + +The [`SELECT` clause](#the-basic-select-statement) supports specifying a [field's](/influxdb/v1.3/concepts/glossary/#field) type and basic cast +operations with the `::` syntax. + + + + + + +
+<table style="width:100%">
+  <tr>
+    <td><a href="#data-types">Data Types</a></td>
+    <td><a href="#cast-operations">Cast Operations</a></td>
+  </tr>
+</table>
+
+## Data types
+
+[Field values](/influxdb/v1.3/concepts/glossary/#field-value) can be floats, integers, strings, or booleans.
+The `::` syntax allows users to specify the field's type in a query.
+
+> **Note:** Generally, it is not necessary to specify the field value
+type in the [`SELECT` clause](#the-basic-select-statement).
+In most cases, InfluxDB rejects any writes that attempt to write a [field value](/influxdb/v1.3/concepts/glossary/#field-value)
+to a field that previously accepted field values of a different type.
+>
+It is possible for field value types to differ across [shard groups](/influxdb/v1.3/concepts/glossary/#shard-group).
+In these cases, it may be necessary to specify the field value type in the
+`SELECT` clause.
+Please see the
+[Frequently Asked Questions](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-field-type-discrepancies-across-shards)
+document for more information on how InfluxDB handles field value type discrepancies.
+
+### Syntax
+```
+SELECT_clause <field_key>::<type> FROM_clause
+```
+
+### Description of Syntax
+
+`type` can be `float`, `integer`, `string`, or `boolean`.
+In most cases, InfluxDB returns no data if the `field_key` does not store data of the specified
+`type`. See [Cast Operations](#cast-operations) for more information.
+
+### Example
+```
+> SELECT "water_level"::float FROM "h2o_feet" LIMIT 4
+
+name: h2o_feet
+--------------
+time water_level
+2015-08-18T00:00:00Z 8.12
+2015-08-18T00:00:00Z 2.064
+2015-08-18T00:06:00Z 8.005
+2015-08-18T00:06:00Z 2.116
+```
+
+The query returns values of the `water_level` field key that are floats.
+
+## Cast Operations
+
+The `::` syntax allows users to perform basic cast operations in queries.
+Currently, InfluxDB supports casting [field values](/influxdb/v1.3/concepts/glossary/#field-value) from integers to
+floats or from floats to integers.
+
+### Syntax
+```
+SELECT_clause <field_key>::<type> FROM_clause
+```
+
+### Description of Syntax
+
+`type` can be `float` or `integer`.
+
+InfluxDB returns no data if the query attempts to cast an integer or float to a
+string or boolean.
+
+### Examples
+
+#### Example 1: Cast float field values to integers
+
+```
+> SELECT "water_level"::integer FROM "h2o_feet" LIMIT 4
+
+name: h2o_feet
+--------------
+time water_level
+2015-08-18T00:00:00Z 8
+2015-08-18T00:00:00Z 2
+2015-08-18T00:06:00Z 8
+2015-08-18T00:06:00Z 2
+```
+
+The query returns the integer form of `water_level`'s float [field values](/influxdb/v1.3/concepts/glossary/#field-value).
+
+#### Example 2: Cast float field values to strings (this functionality is not supported)
+
+```
+> SELECT "water_level"::string FROM "h2o_feet" LIMIT 4
+>
+```
+
+The query returns no data as casting a float field value to a string is not
+yet supported.
+
+
+
+# Merge Behavior
+In InfluxDB, queries merge [series](/influxdb/v1.3/concepts/glossary/#series)
+automatically.
+
+### Example
+
+The `h2o_feet` [measurement](/influxdb/v1.3/concepts/glossary/#measurement) in the `NOAA_water_database` is part of two [series](/influxdb/v1.3/concepts/glossary/#series).
+The first series is made up of the `h2o_feet` measurement and the `location = coyote_creek` [tag](/influxdb/v1.3/concepts/glossary/#tag).
+The second series is made up of the `h2o_feet` measurement and the `location = santa_monica` tag.
+
+The following query automatically merges those two series when it calculates the [average](/influxdb/v1.3/query_language/functions/#mean) `water_level`:
+
+```
+> SELECT MEAN("water_level") FROM "h2o_feet"
+
+name: h2o_feet
+--------------
+time mean
+1970-01-01T00:00:00Z 4.442107025822521
+```
+
+If you want the average `water_level` for the first series only, specify the relevant tag in the [`WHERE` clause](#the-where-clause):
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" = 'coyote_creek'
+
+name: h2o_feet
+--------------
+time mean
+1970-01-01T00:00:00Z 5.359342451341401
+```
+
+If you want the average `water_level` for each individual series, include a [`GROUP BY` clause](#group-by-tags):
+
+```
+> SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location"
+
+name: h2o_feet
+tags: location=coyote_creek
+time mean
+---- ----
+1970-01-01T00:00:00Z 5.359342451341401
+
+name: h2o_feet
+tags: location=santa_monica
+time mean
+---- ----
+1970-01-01T00:00:00Z 3.530863470081006
+```
+
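Because the merged query averages every point across both series, the result generally differs from averaging the two per-series means: the series can hold different numbers of points, and the merged mean weighs each point, not each series, equally. A toy illustration with made-up numbers:

```python
# Hypothetical water_level points for two series of different sizes
coyote_creek = [8.12, 8.005, 7.887]
santa_monica = [2.064, 2.116]

# Merged mean: every point weighs equally, regardless of its series
merged_mean = sum(coyote_creek + santa_monica) / (len(coyote_creek) + len(santa_monica))

# Mean of the per-series means: weighs each series equally instead
series_means = [sum(coyote_creek) / len(coyote_creek), sum(santa_monica) / len(santa_monica)]
mean_of_means = sum(series_means) / 2

print(merged_mean == mean_of_means)  # False: the two aggregations differ
```

This is why the merged result above (`4.4421...`) is not simply the midpoint of the two `GROUP BY "location"` means.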
+
+# Multiple Statements +Separate multiple [`SELECT` statements](#the-basic-select-statement) in a query with a semicolon (`;`). + +### Examples: + +{{< tabs-wrapper >}} +{{% tabs %}} +[Example 1: CLI](#) +[Example 2: HTTP API](#) +{{% /tabs %}} +{{% tab-content %}} + +In InfluxDB's [CLI](/influxdb/v1.3/tools/shell/): + +``` +> SELECT MEAN("water_level") FROM "h2o_feet"; SELECT "water_level" FROM "h2o_feet" LIMIT 2 + +name: h2o_feet +time mean +---- ---- +1970-01-01T00:00:00Z 4.442107025822522 + +name: h2o_feet +time water_level +---- ----------- +2015-08-18T00:00:00Z 8.12 +2015-08-18T00:00:00Z 2.064 +``` + +{{% /tab-content %}} + +{{% tab-content %}} + +With InfluxDB's [HTTP API](/influxdb/v1.3/tools/api/): + +``` +{ + "results": [ + { + "statement_id": 0, + "series": [ + { + "name": "h2o_feet", + "columns": [ + "time", + "mean" + ], + "values": [ + [ + "1970-01-01T00:00:00Z", + 4.442107025822522 + ] + ] + } + ] + }, + { + "statement_id": 1, + "series": [ + { + "name": "h2o_feet", + "columns": [ + "time", + "water_level" + ], + "values": [ + [ + "2015-08-18T00:00:00Z", + 8.12 + ], + [ + "2015-08-18T00:00:00Z", + 2.064 + ] + ] + } + ] + } + ] +} +``` + +{{% /tab-content %}} +{{< /tabs-wrapper >}} + +
+
+# Subqueries + +A subquery is a query that is nested in the `FROM` clause of another query. +Use a subquery to apply a query as a condition in the enclosing query. +Subqueries offer functionality similar to nested functions and SQL +[`HAVING` clauses](https://en.wikipedia.org/wiki/Having_(SQL\)). + +### Syntax +``` +SELECT_clause FROM ( SELECT_statement ) [...] +``` + +### Description of Syntax +InfluxDB performs the subquery first and the main query second. + +The main query surrounds the subquery and requires at least the [`SELECT` clause](#the-basic-select-statement) and the [`FROM` clause](#the-basic-select-statement). +The main query supports all clauses listed in this document. + +The subquery appears in the main query's `FROM` clause, and it requires surrounding parentheses. +The subquery supports all clauses listed in this document. + +InfluxQL supports multiple nested subqueries per main query. +Sample syntax for multiple subqueries: + +``` +SELECT_clause FROM ( SELECT_clause FROM ( SELECT_statement ) [...] ) [...] +``` + +### Examples + +#### Example 1: Calculate the [`SUM()`](/influxdb/v1.3/query_language/functions/#sum) of several [`MAX()`](/influxdb/v1.3/query_language/functions/#max) values +``` +> SELECT SUM("max") FROM (SELECT MAX("water_level") FROM "h2o_feet" GROUP BY "location") + +name: h2o_feet +time sum +---- --- +1970-01-01T00:00:00Z 17.169 +``` + +The query returns the sum of the maximum `water_level` values across every tag value of `location`. 
+ +InfluxDB first performs the subquery; it calculates the maximum value of `water_level` for each tag value of `location`: +``` +> SELECT MAX("water_level") FROM "h2o_feet" GROUP BY "location" +name: h2o_feet + +tags: location=coyote_creek +time max +---- --- +2015-08-29T07:24:00Z 9.964 + +name: h2o_feet +tags: location=santa_monica +time max +---- --- +2015-08-29T03:54:00Z 7.205 +``` + +Next, InfluxDB performs the main query and calculates the sum of those maximum values: `9.964` + `7.205` = `17.169`. +Notice that the main query specifies `max`, not `water_level`, as the field key in the `SUM()` function. + +#### Example 2: Calculate the [`MEAN()`](/influxdb/v1.3/query_language/functions/#mean) difference between two fields +``` +> SELECT MEAN("difference") FROM (SELECT "cats" - "dogs" AS "difference" FROM "pet_daycare") + +name: pet_daycare +time mean +---- ---- +1970-01-01T00:00:00Z 1.75 +``` + +The query returns the average of the differences between the number of `cats` and `dogs` in the `pet_daycare` measurement. + +InfluxDB first performs the subquery. +The subquery calculates the difference between the values in the `cats` field and the values in the `dogs` field, +and it names the output column `difference`: +``` +> SELECT "cats" - "dogs" AS "difference" FROM "pet_daycare" + +name: pet_daycare +time difference +---- ---------- +2017-01-20T00:55:56Z -1 +2017-01-21T00:55:56Z -49 +2017-01-22T00:55:56Z 66 +2017-01-23T00:55:56Z -9 +``` + +Next, InfluxDB performs the main query and calculates the average of those differences. +Notice that the main query specifies `difference` as the field key in the `MEAN()` function. 
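The arithmetic of the main query in Example 2 is easy to confirm by hand from the subquery's output; this is a quick check, not InfluxDB code:

```python
# The subquery's `difference` column from Example 2
differences = [-1, -49, 66, -9]

# The main query's MEAN() over that column
mean_difference = sum(differences) / len(differences)
print(mean_difference)  # 1.75
```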
+ +#### Example 3: Calculate several [`MEAN()`](/influxdb/v1.3/query_language/functions/#mean) values and place a condition on those mean values +``` +> SELECT "all_the_means" FROM (SELECT MEAN("water_level") AS "all_the_means" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) ) WHERE "all_the_means" > 5 + +name: h2o_feet +time all_the_means +---- ------------- +2015-08-18T00:00:00Z 5.07625 +``` + +The query returns all mean values of the `water_level` field that are greater than five. + +InfluxDB first performs the subquery. +The subquery calculates `MEAN()` values of `water_level` from `2015-08-18T00:00:00Z` through `2015-08-18T00:30:00Z` and groups the results into 12-minute intervals. +It also names the output column `all_the_means`: +``` +> SELECT MEAN("water_level") AS "all_the_means" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) + +name: h2o_feet +time all_the_means +---- ------------- +2015-08-18T00:00:00Z 5.07625 +2015-08-18T00:12:00Z 4.950749999999999 +2015-08-18T00:24:00Z 4.80675 +``` + +Next, InfluxDB performs the main query and returns only those mean values that are greater than five. +Notice that the main query specifies `all_the_means` as the field key in the `SELECT` clause. 
+ +#### Example 4: Calculate the [`SUM()`](/influxdb/v1.3/query_language/functions/#sum) of several [`DERIVATIVE()`](/influxdb/v1.3/query_language/functions/#derivative) values +``` +> SELECT SUM("water_level_derivative") AS "sum_derivative" FROM (SELECT DERIVATIVE(MEAN("water_level")) AS "water_level_derivative" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location") GROUP BY "location" + +name: h2o_feet +tags: location=coyote_creek +time sum_derivative +---- -------------- +1970-01-01T00:00:00Z -0.4950000000000001 + +name: h2o_feet +tags: location=santa_monica +time sum_derivative +---- -------------- +1970-01-01T00:00:00Z -0.043999999999999595 +``` + +The query returns the sum of the derivative of average `water_level` values for each tag value of `location`. + +InfluxDB first performs the subquery. +The subquery calculates the derivative of average `water_level` values taken at 12-minute intervals. +It performs that calculation for each tag value of `location` and names the output column `water_level_derivative`: +``` +> SELECT DERIVATIVE(MEAN("water_level")) AS "water_level_derivative" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m),"location" + +name: h2o_feet +tags: location=coyote_creek +time water_level_derivative +---- ---------------------- +2015-08-18T00:12:00Z -0.23800000000000043 +2015-08-18T00:24:00Z -0.2569999999999997 + +name: h2o_feet +tags: location=santa_monica +time water_level_derivative +---- ---------------------- +2015-08-18T00:12:00Z -0.0129999999999999 +2015-08-18T00:24:00Z -0.030999999999999694 +``` + +Next, InfluxDB performs the main query and calculates the sum of the `water_level_derivative` values for each tag value of `location`. +Notice that the main query specifies `water_level_derivative`, not `water_level` or `derivative`, as the field key in the `SUM()` function. 
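When the derivative's unit matches the `GROUP BY time()` interval, as it does by default here, each `DERIVATIVE()` value is simply the difference between consecutive 12-minute means, so the subquery and main query can be checked against the `coyote_creek` means that appear in earlier examples. An outside-of-InfluxDB sanity check:

```python
# 12-minute MEAN("water_level") values for location=coyote_creek
# at 00:00, 00:12, and 00:24, taken from earlier examples in this document
means = [8.0625, 7.8245, 7.5675]

# DERIVATIVE(MEAN(...)) with a unit equal to the interval: consecutive differences
derivatives = [later - earlier for earlier, later in zip(means, means[1:])]

# SUM() of those derivatives, as computed by the main query
print(round(sum(derivatives), 3))  # -0.495
```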
+
+### Common Issues with Subqueries
+
+#### Issue 1: Multiple SELECT statements in a subquery
+
+InfluxQL supports multiple nested subqueries per main query:
+```
+SELECT_clause FROM ( SELECT_clause FROM ( SELECT_statement ) [...] ) [...]
+                     ------------------   ----------------
+                          Subquery 1          Subquery 2
+```
+
+InfluxQL does not support multiple [`SELECT` statements](#the-basic-select-statement) per subquery:
+```
+SELECT_clause FROM (SELECT_statement; SELECT_statement) [...]
+```
+The system returns a parsing error if a subquery includes multiple `SELECT` statements.
diff --git a/content/influxdb/v1.3/query_language/database_management.md b/content/influxdb/v1.3/query_language/database_management.md
new file mode 100644
index 000000000..7c9c525d7
--- /dev/null
+++ b/content/influxdb/v1.3/query_language/database_management.md
@@ -0,0 +1,354 @@
+---
+title: Database Management
+
+menu:
+  influxdb_1_3:
+    weight: 30
+    parent: influxql
+---
+
+InfluxQL offers a full suite of administrative commands.
+
| Data Management | Retention Policy Management |
|:----------------|:----------------------------|
| CREATE DATABASE | CREATE RETENTION POLICY |
| DROP DATABASE | ALTER RETENTION POLICY |
| DROP SERIES | DROP RETENTION POLICY |
| DELETE | |
| DROP MEASUREMENT | |
| DROP SHARD | |
+ +If you're looking for `SHOW` queries (for example, `SHOW DATABASES` or `SHOW RETENTION POLICIES`), see [Schema Exploration](/influxdb/v1.3/query_language/schema_exploration). + +The examples in the sections below use InfluxDB's [Command Line Interface (CLI)](/influxdb/v1.3/introduction/getting_started/). +You can also execute the commands using the HTTP API; simply send a `GET` request to the `/query` endpoint and include the command in the URL parameter `q`. +See the [Querying Data](/influxdb/v1.3/guides/querying_data/) guide for more on using the HTTP API. + +> **Note:** When authentication is enabled, only admin users can execute most of the commands listed on this page. +See the documentation on [authentication and authorization](/influxdb/v1.3/query_language/authentication_and_authorization/) for more information. + +## Data Management + +### CREATE DATABASE + +Creates a new database. + +#### Syntax +```sql +CREATE DATABASE <database_name> [WITH [DURATION <duration>] [REPLICATION <n>] [SHARD DURATION <duration>] [NAME <retention-policy-name>]] +``` + +#### Description of Syntax + +`CREATE DATABASE` requires a database [name](/influxdb/v1.3/troubleshooting/frequently-asked-questions/#what-words-and-characters-should-i-avoid-when-writing-data-to-influxdb). + +The `WITH`, `DURATION`, `REPLICATION`, `SHARD DURATION`, and `NAME` clauses are optional and create a single [retention policy](/influxdb/v1.3/concepts/glossary/#retention-policy-rp) associated with the created database. +If you do not specify one of the clauses after `WITH`, the relevant behavior defaults to the `autogen` retention policy settings. +The created retention policy automatically serves as the database's default retention policy. +For more information about those clauses, see [Retention Policy Management](/influxdb/v1.3/query_language/database_management/#retention-policy-management). + +A successful `CREATE DATABASE` query returns an empty result.
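The HTTP API route described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an InfluxDB instance on the default `localhost:8086` bind address; it builds the `/query` URL with the InfluxQL statement in the `q` parameter:

```python
from urllib.parse import urlencode

# Default InfluxDB HTTP bind address; adjust host/port for your setup
base_url = "http://localhost:8086/query"

# The InfluxQL statement travels URL-encoded in the `q` parameter
params = urlencode({"q": 'CREATE DATABASE "NOAA_water_database"'})
url = f"{base_url}?{params}"
print(url)  # http://localhost:8086/query?q=CREATE+DATABASE+%22NOAA_water_database%22

# Sending the request requires a running server, e.g.:
# from urllib.request import urlopen
# with urlopen(url) as resp:
#     body = resp.read()  # a successful CREATE DATABASE returns an empty result set
```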
+If you attempt to create a database that already exists, InfluxDB does nothing and does not return an error. + +#### Examples + +##### Example 1: Create a database +
+``` +> CREATE DATABASE "NOAA_water_database" +> +``` + +The query creates a database called `NOAA_water_database`. +[By default](/influxdb/v1.3/administration/config/#retention-autocreate-true), InfluxDB also creates the `autogen` retention policy and associates it with the `NOAA_water_database`. + +##### Example 2: Create a database with a specific retention policy +
+ +``` +> CREATE DATABASE "NOAA_water_database" WITH DURATION 3d REPLICATION 1 SHARD DURATION 1h NAME "liquid" +> +``` + +The query creates a database called `NOAA_water_database`. +It also creates a default retention policy for `NOAA_water_database` with a `DURATION` of three days, a [replication factor](/influxdb/v1.3/concepts/glossary/#replication-factor) of one, a [shard group](/influxdb/v1.3/concepts/glossary/#shard-group) duration of one hour, and with the name `liquid`. + +### Delete a database with DROP DATABASE + +The `DROP DATABASE` query deletes all of the data, measurements, series, continuous queries, and retention policies from the specified database. +The query takes the following form: +```sql +DROP DATABASE <database_name> +``` + +Drop the database NOAA_water_database: +```bash +> DROP DATABASE "NOAA_water_database" +> +``` + +A successful `DROP DATABASE` query returns an empty result. +If you attempt to drop a database that does not exist, InfluxDB does not return an error. + +### Drop series from the index with DROP SERIES + +The `DROP SERIES` query deletes all points from a [series](/influxdb/v1.3/concepts/glossary/#series) in a database, +and it drops the series from the index. + +> **Note:** `DROP SERIES` does not support time intervals in the `WHERE` clause. +See +[`DELETE`](/influxdb/v1.3/query_language/database_management/#delete-series-with-delete) +for that functionality.
+ +The query takes the following form, where you must specify either the `FROM` clause or the `WHERE` clause: +```sql +DROP SERIES FROM <measurement_name[,measurement_name]> WHERE <tag_key>='<tag_value>' +``` + +Drop all series from a single measurement: +```sql +> DROP SERIES FROM "h2o_feet" +``` + +Drop series with a specific tag pair from a single measurement: +```sql +> DROP SERIES FROM "h2o_feet" WHERE "location" = 'santa_monica' +``` + +Drop all points in the series that have a specific tag pair from all measurements in the database: +```sql +> DROP SERIES WHERE "location" = 'santa_monica' +``` + +A successful `DROP SERIES` query returns an empty result. + +### Delete series with DELETE + +The `DELETE` query deletes all points from a +[series](/influxdb/v1.3/concepts/glossary/#series) in a database. +Unlike +[`DROP SERIES`](/influxdb/v1.3/query_language/database_management/#drop-series-from-the-index-with-drop-series), it does not drop the series from the index and it supports time intervals +in the `WHERE` clause. + +The query takes the following form where you must include either the `FROM` +clause or the `WHERE` clause, or both: + +``` +DELETE FROM <measurement_name> WHERE [<tag_key>='<tag_value>'] | [