1073 Commits (29b19a22934008d7f0dfed8589aa58240afabff4)

Author SHA1 Message Date
Ben Johnson 525e22c92b
tsm1 query engine alloc reduction
This commit makes a number of performance improvements to
reduce allocations during query execution. Several objects
and buffers are now reused across the components to avoid
allocations.

Previously a simple `count(value)` query across 1M points
would require 26,000+ allocations. After the changes in
this commit that number has been reduced to 88.
2016-04-11 14:50:59 -06:00
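
The buffer reuse described in commit 525e22c92b can be illustrated with a minimal Go sketch. It is not the code from the commit: `point`, `decodeBlock`, and `countPoints` are hypothetical names, and the "decoding" is synthetic. It only shows the general pattern of passing one buffer back into the decoder instead of allocating a new slice per block.

```go
package main

import "fmt"

// point is a hypothetical stand-in for a decoded time series value.
type point struct {
	Time  int64
	Value float64
}

// decodeBlock fills dst with n synthetic points and returns the filled slice.
// It reuses dst's backing array whenever the capacity is large enough and
// allocates only when the buffer is too small.
func decodeBlock(dst []point, n int) []point {
	if cap(dst) < n {
		dst = make([]point, n)
	}
	dst = dst[:n]
	for i := range dst {
		dst[i] = point{Time: int64(i), Value: 1}
	}
	return dst
}

// countPoints counts points across blocks, handing the same buffer back to
// decodeBlock on every iteration so that later blocks need no new allocation.
func countPoints(blocks, perBlock int) int {
	var buf []point
	total := 0
	for b := 0; b < blocks; b++ {
		buf = decodeBlock(buf, perBlock)
		total += len(buf)
	}
	return total
}

func main() {
	fmt.Println(countPoints(1000, 1000))
}
```

In this sketch the buffer is allocated once, for the first block, and resliced afterwards.
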
Jonathan A. Sternberg 815c49aa5f Integration test to check for empty tag behavior 2016-04-11 14:00:08 -04:00
Jonathan A. Sternberg 6040375370 Modify cluster.TSDBStore interface to take a slice of meta.ShardInfo
The tsdb package can't have a dependency on the meta package so it takes
a slice of uint64 types. The clustering implementation needs the full
ShardInfo to know the shard owners though, so a different implementation
needs to be used by clustering.

The `*tsdb.Store` type gets wrapped in the cluster package so it can
implement the `IteratorCreator` function without having a dependency on
the meta package.
2016-04-05 16:40:32 -04:00
Todd Persen 05acce6645 Clarify the upgrade error message. 2016-04-04 15:28:35 -07:00
Jonathan A. Sternberg 37b63cedec Cleanup QueryExecutor and split statement execution code
The QueryExecutor had a lot of dead code made obsolete by the query
engine refactor that has now been removed. The TSDBStore interface has
also been cleaned up so we can have multiple implementations of this
(such as a local and remote version).

A StatementExecutor interface has been created for adding custom
functionality to the QueryExecutor that may not be available in the open
source version. The QueryExecutor delegates all statement execution to
the StatementExecutor and the QueryExecutor will only keep track of
housekeeping. Implementing additional queries is as simple as wrapping
the cluster.StatementExecutor struct or replacing it with something
completely different.

The PointsWriter in the QueryExecutor has been changed to a simple
interface that implements the one method needed by the query executor.
This is to allow different PointsWriter implementations to be used by
the QueryExecutor. It has also been moved into the StatementExecutor
instead.

The TSDBStore interface has now been modified to contain the code for
creating an IteratorCreator. This is so the underlying TSDBStore can
implement different ways of accessing the underlying shards rather than
always having to access each shard individually (such as batch
requests).

Remove the show servers handling. This isn't a valid command in the open
source version of InfluxDB anymore.

The QueryManager interface is now built into QueryExecutor and is no
longer necessary. The StatementExecutor and QueryExecutor split allows
task management to much more easily be built into QueryExecutor rather
than as a separate struct.
2016-04-04 13:27:17 -04:00
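
The delegation described in commit 37b63cedec might look roughly like the following Go sketch. The type names `QueryExecutor` and `StatementExecutor` come from the commit message, but the method names, signatures, the `Statement` type, and the `SHOW CUSTOM` statement are assumptions made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// Statement is a hypothetical stand-in for a parsed query statement.
type Statement string

// StatementExecutor runs a single statement. The method name and signature
// are assumptions for this sketch, not the interface from the commit.
type StatementExecutor interface {
	ExecuteStatement(stmt Statement) (string, error)
}

// QueryExecutor only does housekeeping (here: counting executed statements)
// and delegates the actual work to its StatementExecutor.
type QueryExecutor struct {
	StatementExecutor StatementExecutor
	executed          int
}

// Execute records the statement and forwards it to the delegate.
func (e *QueryExecutor) Execute(stmt Statement) (string, error) {
	if e.StatementExecutor == nil {
		return "", errors.New("no statement executor configured")
	}
	e.executed++
	return e.StatementExecutor.ExecuteStatement(stmt)
}

// baseExecutor handles the statements every build understands.
type baseExecutor struct{}

func (baseExecutor) ExecuteStatement(stmt Statement) (string, error) {
	if stmt == "SHOW DATABASES" {
		return "db0", nil
	}
	return "", fmt.Errorf("unsupported statement: %s", stmt)
}

// extendedExecutor wraps another executor and adds one extra statement,
// falling back to the wrapped executor for everything else.
type extendedExecutor struct {
	StatementExecutor
}

func (x extendedExecutor) ExecuteStatement(stmt Statement) (string, error) {
	if stmt == "SHOW CUSTOM" { // hypothetical statement
		return "custom result", nil
	}
	return x.StatementExecutor.ExecuteStatement(stmt)
}

func main() {
	qe := &QueryExecutor{StatementExecutor: extendedExecutor{baseExecutor{}}}
	for _, s := range []Statement{"SHOW DATABASES", "SHOW CUSTOM"} {
		out, err := qe.Execute(s)
		fmt.Println(out, err)
	}
}
```

Wrapping by embedding keeps the base behavior intact while letting a different build swap in or extend the executor without touching `QueryExecutor`.
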
Cory LaNou 9bd60c3e07 remove node.json as well, change where we check for raft.db in startup 2016-03-31 17:15:58 -05:00
Cory LaNou a961ff9ebf minor restore fixes; fsync meta snapshots 2016-03-31 17:15:42 -05:00
Edd Robinson 9cd0bc65f5 Let SHARD DURATION be specified in isolation
Fixed #6152.
2016-03-31 17:42:50 +01:00
Jonathan A. Sternberg ad7480e64b Limit bucket count in selection
Fixes #6078.
2016-03-30 22:57:09 -04:00
Jonathan A. Sternberg 178a6e2f0a Merge pull request #6113 from influxdata/js-6112-simple-moving-average
Implement simple moving average
2016-03-30 20:57:55 -04:00
Jonathan A. Sternberg 711a6614e6 Implement the point limit monitor
Fixes #6077.
2016-03-30 16:08:56 -04:00
Jonathan A. Sternberg 6453dbc249 Implement simple moving average
The simple moving average will gradually emit points instead of waiting
until the end. This should apply to derivative and difference in the
future too.

Fixes #6112.
2016-03-29 14:36:43 -04:00
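
A minimal Go sketch of the gradual emission described in commit 6453dbc249, assuming a fixed window of `n` values and a running sum. `movingAverage`, `newMovingAverage`, and `Push` are hypothetical names, not the iterator types used by the query engine.

```go
package main

import "fmt"

// movingAverage is a hypothetical streaming reducer over the last n values.
type movingAverage struct {
	window []float64 // ring buffer holding the current window
	pos    int       // next slot to overwrite
	count  int       // number of values seen, capped at len(window)
	sum    float64   // running sum of the values in the window
}

// newMovingAverage returns a reducer for a window of n values.
// Window sizes below 1 are clamped to 1 in this sketch.
func newMovingAverage(n int) *movingAverage {
	if n < 1 {
		n = 1
	}
	return &movingAverage{window: make([]float64, n)}
}

// Push adds one value. As soon as the window is full it returns the current
// average and true, so a caller can emit a point per input value instead of
// waiting for the end of the input.
func (m *movingAverage) Push(v float64) (float64, bool) {
	if m.count == len(m.window) {
		m.sum -= m.window[m.pos] // drop the oldest value
	} else {
		m.count++
	}
	m.window[m.pos] = v
	m.sum += v
	m.pos = (m.pos + 1) % len(m.window)
	if m.count < len(m.window) {
		return 0, false
	}
	return m.sum / float64(len(m.window)), true
}

func main() {
	ma := newMovingAverage(2)
	for _, v := range []float64{1, 2, 3, 4} {
		if avg, ok := ma.Push(v); ok {
			fmt.Println(avg)
		}
	}
}
```
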
Jonathan A. Sternberg c1643e69c1 Have the server kill all queries on shutdown
Related to #6140, but won't actually fix that problem. It will correctly
stop new queries from being started during shutdown and will send the
interrupt signal to queries during shutdown.

Since the interrupt signal is asynchronous, there isn't currently a way
to wait for the queries to complete themselves before shutting down the
engine.
2016-03-29 11:48:21 -04:00
Jonathan A. Sternberg a86632912f Fix the difference test
A recent bugfix to CREATE RETENTION POLICY caused this to fail when
merged. This fixes the test.
2016-03-29 10:03:43 -04:00
Jonathan A. Sternberg 9ddc59aab5 Merge pull request #6105 from influxdata/js-1825-difference-function
Implement the difference function
2016-03-29 09:37:59 -04:00
Jonathan A. Sternberg a9720f926e Implement the difference function
The difference function is implemented very similarly to how derivative is
implemented. It is an aggregate function that acts over the entire
aggregate. This function will also have the same problems that
derivative has with getting values from the previous interval or point.
This will be fixed separately as part of #5943.

Fixes #1825.
2016-03-29 09:27:12 -04:00
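
As a rough illustration of what commit a9720f926e computes, the following Go sketch subtracts each value from its successor. The function name `difference` and the plain slice interface are assumptions; the commit implements this as an aggregate inside the query engine.

```go
package main

import "fmt"

// difference returns values[i] - values[i-1] for each consecutive pair.
// The first value has no predecessor, so it produces no output; this mirrors
// the limitation about previous points mentioned in the commit message.
func difference(values []float64) []float64 {
	if len(values) < 2 {
		return nil
	}
	out := make([]float64, 0, len(values)-1)
	for i := 1; i < len(values); i++ {
		out = append(out, values[i]-values[i-1])
	}
	return out
}

func main() {
	fmt.Println(difference([]float64{1, 4, 9}))
}
```
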
Edd Robinson adffbc2ba0 Fix tests to not clash with retention policy 2016-03-29 11:27:58 +01:00
Tait Clarridge 45b3e61ac7 Add configurable shard duration to retention policies
Allows configuration of shard group duration at database creation, and retention
policy create/alter time.

Query examples:

```
CREATE DATABASE testdb WITH DURATION 90d SHARD DURATION 30m NAME rp_testdb
CREATE RETENTION POLICY rp_testdb2 ON testdb DURATION INF REPLICATION 1 SHARD DURATION 30m
ALTER RETENTION POLICY rp_testdb2 ON testdb SHARD DURATION 1h
```

This can be useful with long duration retention policies with lots of data, where
you can split into smaller shards to relieve memory pressure.
2016-03-24 00:25:49 -04:00
Ben Johnson a6d9930b6f limit series count in selection
This commit adds a configurable limit to the number of series that
can be returned from a `SELECT` statement. The limit is checked
immediately after planning and is determined by the use of iterator
stats.

Fixes #6076
2016-03-23 12:48:48 -06:00
Jonathan A. Sternberg 79fe4490c2 Support a timeout for running queries in the query manager
Include an interrupt iterator at the top level to interrupt the fill
iterator if it is producing too many points.

Fixes #6075.
2016-03-22 13:30:40 -04:00
Jonathan A. Sternberg a35d9602cd Fix where filters when an OR is used and when a tag does not exist
If an OR was used, merging filters between different expressions would
not work correctly. If one of the sides had a set of series ids with a
condition and the other side had no series ids associated with the
expression, all of the series from the side with a condition would have
the condition ignored. Instead of defaulting a nonexistent series
filter to true, it should just be false and the evaluation of the one
side that does exist should take care of determining if the series id
should be included or not. The AND condition used false correctly so did
not have to be changed.

If a tag did not exist and `!=` or `!~` were used, it would return false
even though neither a field nor a tag equaled those values. This has
now been modified to return the correct series ids and the correct
condition.

Also fixed a panic that would occur when a tag caused a field access to
become unnecessary. The filter using the field access still got created
and used even though it was unnecessary, resulting in an attempted
access to a non-initialized map.

Fixes #5152 and a bunch of other miscellaneous issues.
2016-03-22 12:19:06 -04:00
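
The OR-merging fix described in commit a35d9602cd can be sketched in Go as follows. This is a simplified model, not the code from the commit: conditions are plain functions over a single value, and `cond`, `alwaysFalse`, and `unionFilters` are hypothetical names. The point it shows is that a missing filter on one side defaults to false, so `cond OR false` keeps the existing condition rather than being widened to true.

```go
package main

import "fmt"

// cond is a hypothetical per-series condition over a single field value.
type cond func(value float64) bool

// alwaysFalse is used when one side of the OR has no filter for a series.
func alwaysFalse(float64) bool { return false }

// unionFilters merges per-series conditions for the expression "lhs OR rhs".
// A series that appears on only one side keeps that side's condition,
// because the missing side contributes false rather than true.
func unionFilters(lhs, rhs map[uint64]cond) map[uint64]cond {
	get := func(m map[uint64]cond, id uint64) cond {
		if c, ok := m[id]; ok {
			return c
		}
		return alwaysFalse
	}
	out := make(map[uint64]cond, len(lhs)+len(rhs))
	for _, side := range []map[uint64]cond{lhs, rhs} {
		for id := range side {
			l, r := get(lhs, id), get(rhs, id)
			out[id] = func(v float64) bool { return l(v) || r(v) }
		}
	}
	return out
}

func main() {
	lhs := map[uint64]cond{1: func(v float64) bool { return v > 10 }}
	rhs := map[uint64]cond{} // e.g. a tag that matched no series
	merged := unionFilters(lhs, rhs)
	fmt.Println(merged[1](5), merged[1](20))
}
```
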
Jonathan A. Sternberg 8ab1a9b513 Merge pull request #6083 from influxdata/js-6079-limit-max-concurrent-queries
Limit the maximum number of concurrent queries
2016-03-22 12:08:36 -04:00
Jason Wilder 7857e07a1e Merge pull request #6062 from influxdata/mr-prune-wal-config
Remove unused WAL configuration variables/fields
2016-03-22 09:20:27 -06:00
Jonathan A. Sternberg abae1cfed0 Limit the maximum number of concurrent queries
Fixes #6079.
2016-03-21 22:34:27 -04:00
Jonathan A. Sternberg 117f62c33e Implement a simple task manager for queries
The currently running queries can be listed with the command
`SHOW QUERIES` and it will display the current commands that have been
run, the database they were run against, and how long they have been
running.
2016-03-21 12:06:06 -04:00
Mark Rushakoff 7a2adfcc5d Remove unused WAL configuration variables/fields
These were all b1/bz1 settings that no longer have any effect:

- {Default,}MaxWALSize
- {Default,}WALFlushInterval
- {Default,}WALPartitionFlushDelay
- {Default,WAL}ReadySeriesSize
- {Default,WAL}CompactionThreshold
- {Default,WAL}MaxSeriesSize
- {Default,WAL}FlushColdInterval
- {Default,WAL}PartitionSizeThreshold
2016-03-20 13:16:52 -07:00
Jonathan A. Sternberg 43a5e84aaf Merge pull request #6047 from influxdata/js-6040-boolean-distinct
Support the distinct() call for booleans
2016-03-17 17:17:21 -04:00
Jonathan A. Sternberg e47426ff6e Support integer literals in the query language
Numbers in the query without any decimal will now be emitted as integers
instead and be parsed as an IntegerLiteral. This ensures we keep the
original context that a query was issued with and allows us to act more
similar to how programming languages are typically structured when it
comes to floats and ints.

This adds functionality for dealing with integers promoting to floats in
the various places where math is used.

Fixes #5744 and #5629.
2016-03-17 10:37:34 -04:00
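
A small Go sketch of the behavior described in commit e47426ff6e: a number without a decimal point is kept as an integer, and an integer is promoted to a float only when it meets a float in arithmetic. `parseNumber` and `add` are hypothetical helpers written for this illustration and are not the parser or evaluator from the commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumber returns an int64 for input without a decimal point and a
// float64 otherwise, so the original form of the literal is preserved.
func parseNumber(s string) (interface{}, error) {
	if strings.Contains(s, ".") {
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		return f, nil
	}
	i, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return nil, err
	}
	return i, nil
}

// add applies "+" to two literals. Two integers stay an integer; a mixed
// pair promotes the integer operand to a float first.
func add(lhs, rhs interface{}) (interface{}, error) {
	switch l := lhs.(type) {
	case int64:
		switch r := rhs.(type) {
		case int64:
			return l + r, nil
		case float64:
			return float64(l) + r, nil
		}
	case float64:
		switch r := rhs.(type) {
		case int64:
			return l + float64(r), nil
		case float64:
			return l + r, nil
		}
	}
	return nil, fmt.Errorf("unsupported operand types %T and %T", lhs, rhs)
}

func main() {
	a, _ := parseNumber("10")
	b, _ := parseNumber("2.5")
	sum, err := add(a, b)
	fmt.Println(sum, err)
}
```
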
Jonathan A. Sternberg 2e7816ebd9 Support the distinct() call for booleans
Normalize the time for the distinct() call to either be at the beginning
of the group by interval or the start time similar to every other call.
The timestamp previously just showed the first time found and didn't
make a lot of sense in the context of what the function was supposed to
do.

Fixes #6040.
2016-03-17 09:32:54 -04:00
Gunnar c6ff2588d1 Merge pull request #6025 from influxdata/ga-remove-json
Remove deprecated JSON write path
2016-03-16 14:17:23 -07:00
gunnaraasen d96eef4c52 Remove deprecated JSON write path 2016-03-15 19:52:41 -07:00
Jason Wilder defc594139 Add a build tag to disable all services except TCP endpoint 2016-03-15 20:27:01 -06:00
Cory LaNou 1d2c1faa94 address PR feedback 2016-03-14 16:55:54 +00:00
Cory LaNou d024ca2552 modify WritePoints function signature for p products 2016-03-14 16:55:54 +00:00
Cory LaNou cd84f26c34 remove startup check for monitoring 2016-03-14 16:55:54 +00:00
Cory LaNou 27cfaa4b7a in memory meta, single node configs, etc. 2016-03-14 16:55:54 +00:00
Philip O Toole b9cbff8ac4 Format all logging 2016-03-11 10:05:54 -08:00
Edd Robinson 7dbc0f49d3 Merge pull request #5818 from influxdata/er-upgrade-error
Highlight upgrade info for old shards
2016-03-09 19:39:59 +00:00
Jason Wilder 4e8b4c41b8 Fix rollback from 0.11
0.11 no longer uses some files from 0.10.  The code was a little
too aggressive and removed these files, which would break rolling back
to 0.10 if necessary.  Since shards must be migrated to tsm before
upgrading to 0.11 and a user might not know they still have old shard
formats, they would not be able to revert back to 0.10 and migrate
them.

Also adds uptime stats to usage data.
2016-03-09 11:19:47 -07:00
Ben Johnson 41dde61226 SHOW SERIES 2016-03-08 11:47:57 -07:00
Todd Persen c7f8402dfe Get client version dynamically 2016-03-07 17:16:38 -08:00
Jonathan A. Sternberg 2f0e246757 Implemented the tag values iterator for `SHOW TAG VALUES`
`SHOW TAG VALUES` output has been modified to print the measurement name
for every measurement and to return the output in two columns: key and
value. An example output might be:

    > SHOW TAG VALUES WITH KEY IN (host, region)
    name: cpu
    ---------
    key     value
    host    server01
    region  useast

    name: mem
    ---------
    key     value
    host    server02
    region  useast

`measurementsByExpr` has been taught how to handle reserved keys (ones
with an underscore at the beginning) to allow reusing that function and
skipping over expressions that don't matter to the call.

Fixes #5593.
2016-03-06 09:52:34 -05:00
Ben Johnson eaed2aadcf Merge pull request #5811 from benbjohnson/remote-exec-2
Remote Execution
2016-02-25 09:05:17 -07:00
Ben Johnson 0dda9f6608 add remote execution
This commit adds remote execution to the query engine.
2016-02-25 08:41:20 -07:00
Ross McDonald 1a62cdbd9a Removed builtTime reference from influxd. Removed default version information from influxd. 2016-02-25 09:35:03 -06:00
Ross McDonald 6efd822810 Remove build time linker flag so that we can create reproducible builds. 2016-02-25 09:35:03 -06:00
Edd Robinson aa845cec7e Check for shards needing conversion. Fixes #5723 2016-02-25 13:21:13 +00:00
Goutham Veeramachaneni b1d7e59546 Lint cmd/ packages
Related to #4098

Lint cmd/influxd/

* Errors cannot end with punctuation
* Better comment for exported method
* Better control flow when return is present

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

Linted cmd/influx_tsm

* Added comments to exported fields
* Removed punctuation at the end of errors

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

Linted cmd/influx_tsm/b1 and cmd/influx_tsm/bz1

* Added comments to exported fields

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

Linted cmd/influx_tsm/tsdb

* Added comments to exported fields
* range k, _ :=  can be written as range k :=
* removed else when return is present
* Added consistency to receiver names in methods

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

Fix typos

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2016-02-25 01:44:23 +05:30
Edd Robinson 8add49fd96 Ensures meta queries work in clusters.
Fixes #5612, #5573 and #5518.

Using the MetaExecutor, queries that need to run on both data nodes
and optionally the meta store will be executed across all data nodes
in the cluster.
2016-02-24 11:24:45 -05:00
David Norton 4d4e382ddf Add a Meta Executor.
The Meta Executor will allow data nodes to execute queries
remotely on each other, via RPC calls.
2016-02-24 11:24:22 -05:00