32 Commits (2322ec4e8f4479dbb92e5080e18f534a1c5229da)

Author SHA1 Message Date
Stuart Carnie f3d45ba301 influxdata/influxdb/influxql -> influxdata/influxql 2017-10-30 14:40:26 -07:00
Edd Robinson f80591bfa1 Implement MEASUREMENT cardinality estimation 2017-10-26 16:22:31 +01:00
Edd Robinson 3079b41f00 Implement series cardinality estimation 2017-10-26 16:22:31 +01:00
Stuart Carnie e9313876ab EXPLAIN ANALYZE
* Introduces EXPLAIN ANALYZE command, which
  produces a detailed tree of operations used to
  execute the query.

introduce context.Context to APIs

metrics package

* create groups of named measurements
* safe for concurrent access

tracing package

EXPLAIN ANALYZE implementation for OSS

Serialize EXPLAIN ANALYZE traces from remote nodes

use context.Background for tests

group with other stdlib packages

additional documentation and remove unused API

use influxdb/pkg/testing/assert

remove testify reference
2017-10-20 08:01:37 -07:00
Joe LeGasse 1443b22379 auth: add series auth to 'show tag values' 2017-09-27 20:01:18 -04:00
Jonathan A. Sternberg 50d404e690 Initial implementation of explain plan
It prints the statistics of each iterator that will access the storage
engine. For each access of the storage engine, it will print the number
of shards that will potentially be accessed, the number of files that
may be accessed, the number of series that will be created, the number
of blocks, and the size of those blocks.
2017-09-01 09:01:10 -05:00
Jonathan A. Sternberg 9a2357c2c0 Separate the query engine into a separate package
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
2017-08-16 13:38:43 -05:00
Stuart Carnie 47f97ea134 use parsed measurement and models.Tags 2017-05-26 13:21:59 -07:00
Joe LeGasse 815f740f4c initial fga work
wip

wip

fix tests / build
2017-05-26 13:16:27 -07:00
Ben Johnson 358b1e0b05 Merge remote-tracking branch 'upstream/master' into tsi 2017-03-15 10:13:32 -06:00
Mark Rushakoff 601cbcd084 Merge branch '1.2' into mr-merge-12 2017-02-17 16:14:22 -08:00
Jonathan A. Sternberg 2fe48d6781 Rename zap import back to github.com/uber-go/zap
They rebased a revision we were previously relying upon that allowed us
to use the vanity name so we are reverting back to an older version with
the old import path.
2017-02-17 17:17:22 -06:00
Mark Rushakoff 53699aa24f Allow non-admin users to execute SHOW DATABASES
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.

When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.

When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.

This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.

Fixes #4785.
2017-02-13 08:59:16 -08:00
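
The Authorizer behaviour described in 53699aa24f can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the method name AuthorizeDatabase, the userInfo and openAuthorizer types, and the ShowDatabases helper are all assumptions made for the example.

```go
package main

import "fmt"

// Authorizer is a hypothetical, simplified stand-in for the interface the
// commit describes: it reports whether a given database may be accessed.
type Authorizer interface {
	AuthorizeDatabase(name string) bool
}

// openAuthorizer models the case where HTTP authentication is disabled:
// every database is visible.
type openAuthorizer struct{}

func (openAuthorizer) AuthorizeDatabase(string) bool { return true }

// userInfo is an illustrative user record with an admin flag and a set of
// databases on which the user holds read and/or write permission.
type userInfo struct {
	Admin      bool
	Privileges map[string]bool
}

// AuthorizeDatabase lets admins see everything and other users see only
// the databases listed in Privileges.
func (u userInfo) AuthorizeDatabase(name string) bool {
	return u.Admin || u.Privileges[name]
}

// ShowDatabases returns only the databases the authorizer permits,
// preserving the input order.
func ShowDatabases(all []string, a Authorizer) []string {
	visible := make([]string, 0, len(all))
	for _, db := range all {
		if a.AuthorizeDatabase(db) {
			visible = append(visible, db)
		}
	}
	return visible
}

func main() {
	all := []string{"metrics", "logs", "internal"}
	reader := userInfo{Privileges: map[string]bool{"logs": true}}

	fmt.Println(ShowDatabases(all, reader))               // non-admin user
	fmt.Println(ShowDatabases(all, userInfo{Admin: true})) // admin user
	fmt.Println(ShowDatabases(all, openAuthorizer{}))      // auth disabled
}
```

Passing the authorizer in as an interface keeps the listing code unaware of how users are stored, which is what would allow the check to be extended to other resources later.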
Ben Johnson 047c21f4d9 Merge remote-tracking branch 'upstream/master' into tsi 2017-01-24 09:28:58 -07:00
Jonathan A. Sternberg d7c8c7ca4f Support subquery execution in the query language
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.

Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).

The syntax is like this:

    SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)

This will execute derivative and then sum the result of those derivatives.
Another example:

    SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)

This would let you find the maximum minimum value of each host.

There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:

    SELECT mean(value) FROM cpu
    SELECT mean(value) FROM (SELECT value FROM cpu)

Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
2017-01-07 13:00:48 -06:00
Edd Robinson 0f9b2bfe6a Fix tests 2017-01-05 10:16:15 -07:00
Ben Johnson 409b0165f5 shared in-memory index 2017-01-05 10:09:57 -07:00
Gustav Westling 26b33307ae Resolved PR comments on test files 2016-12-30 11:42:38 +01:00
Gustav Westling 56d98325da Removed ineffective assignments, and added checks for errors that previously were not checked 2016-12-29 20:26:15 +01:00
Jonathan A. Sternberg ec57108520 Use proper uber-go/zap import path
It looks like the real import path to the project is go.uber.org/zap
instead of github.com/uber-go/zap since the example in the project
references that path.
2016-12-15 08:54:14 -06:00
Jonathan A. Sternberg 21502a39e8 Switch logging to use structured logging everywhere
The logging library has been switched to use uber-go/zap. While the
logging has been changed to use structured logging, this commit does not
change any of the logging statements to take advantage of the new
structured log or new log levels. Those changes will come in future
commits.
2016-12-14 10:45:15 -06:00
Jason Wilder 0b6f5441b9 Add config option to messages when limits exceeded
When a limit is exceeded, we return errors and sometimes log (if appropriate)
that a limit was exceeded.  The messages don't always provide an indication
as to where or how they are configured.

Instead, return the config option (which is easy to search for) as well as the
limit currently set and the value that exceeded it when possible.
2016-10-28 14:54:45 -06:00
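
A limit error of the kind 0b6f5441b9 describes might look like the sketch below. The limitError type, the checkLimit helper, the message layout, and the treatment of a zero limit as "unlimited" are assumptions for illustration; the option name is only an example value.

```go
package main

import "fmt"

// limitError is a hypothetical error type that carries the name of the
// config option, the configured limit, and the value that exceeded it.
type limitError struct {
	Option string
	Limit  int
	Actual int
}

// Error puts the option name first so the message can be searched for
// directly in a configuration file.
func (e *limitError) Error() string {
	return fmt.Sprintf("%s limit exceeded (%d/%d)", e.Option, e.Actual, e.Limit)
}

// checkLimit returns an error when actual exceeds limit.
// Assumption: a limit of 0 means the limit is disabled.
func checkLimit(option string, limit, actual int) error {
	if limit > 0 && actual > limit {
		return &limitError{Option: option, Limit: limit, Actual: actual}
	}
	return nil
}

func main() {
	if err := checkLimit("max-select-series", 10, 25); err != nil {
		fmt.Println(err)
	}
}
```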
Jason Wilder d105e344c2 Don't normalize drop/delete series statements
7093 causes a parse error to be returned from delete and drop
statements.  Normalizing them causes an invalid statement to be generated,
which cannot be reparsed if converted to a string and back.
2016-10-27 16:21:07 -06:00
Ben Johnson 55b3e63ced concurrent series limit
This commit fixes the `MaxSelectSeriesN` limit which was broken by
the implementation of lazy iterators. The setting previously limited
the total number of series but the new implementation limits the
concurrent number of series being processed.
2016-08-09 08:58:01 -06:00
Jonathan A. Sternberg 86bd97f3b9 Switch SHOW MEASUREMENTS and SHOW TAG VALUES to directly access the tsdb.Store
The `SHOW MEASUREMENTS` and `SHOW TAG VALUES` statements cannot go through the
query engine to get the speed they need. They also only need access to
the database index and do not need access to specific shards. This
removes the query rewriting that was done to turn these two queries into
a select statement and reimplements them inside of the coordinator as an
interface on the TSDBStore.
2016-07-28 17:38:11 -05:00
Ben Johnson 7d4bea7153 add node id to execution options
This commit changes the `ExecutionOptions` and `SelectOptions` to
allow a `NodeID` for specifying an exact node to query against.
2016-06-10 09:20:44 -06:00
Jonathan A. Sternberg b972c220aa Merge pull request #6757 from influxdata/js-refactor-execute-query
Refactor ExecuteQuery to take options as a struct
2016-06-07 10:35:52 -05:00
Ben Johnson 1b94cd2686 optimize SHOW TAG VALUES
This commit optimizes `SHOW TAG VALUES` so that it avoids the
`SELECT` query engine execution and iterator creation. There
are also optimizations to reduce individual memory allocations
and to reduce in-memory heap size by only operating on one
measurement at a time.

Execution time has been reduced to approximately 900ms for
500,000 rows. This is about 2µs per row. Of this time,
approximately 1µs is spent retrieving and sorting the row
and 1µs is spent encoding into JSON and writing to the
response body.
2016-06-06 15:50:53 -06:00
Jason Wilder a74ea4cbf4 Allow creating shards in a disabled state
For restoring a shard, we need to be able to have the shard open,
but disabled.  It was racy to open it and then disable it separately
since writes/queries could occur in between that time.
2016-06-01 16:17:18 -06:00
Jonathan A. Sternberg 71c8e9e567 Refactor ExecuteQuery to take options as a struct
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.

Removing the unused Series structs. Their necessity was removed by a
previous commit, but the structs were not removed yet.

Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
2016-06-01 12:30:52 -05:00
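
The options-struct refactoring in 71c8e9e567 follows a common Go pattern, sketched below. The field names, the default chunk size, and the string return value are assumptions chosen for the example and are not taken from the commit.

```go
package main

import "fmt"

// ExecutionOptions is a hypothetical options struct. New settings can be
// added as fields without changing the signature of ExecuteQuery.
type ExecutionOptions struct {
	Database  string
	ChunkSize int
	ReadOnly  bool
}

// ExecuteQuery takes a single struct instead of a growing list of
// positional parameters. It only formats its inputs here; a real
// implementation would run the query.
func ExecuteQuery(query string, opt ExecutionOptions) string {
	if opt.ChunkSize == 0 {
		opt.ChunkSize = 10000 // assumed default, for illustration only
	}
	return fmt.Sprintf("db=%s chunk=%d readonly=%t query=%q",
		opt.Database, opt.ChunkSize, opt.ReadOnly, query)
}

func main() {
	// Callers set only the fields they care about; the rest keep zero values.
	fmt.Println(ExecuteQuery("SELECT 1", ExecutionOptions{Database: "db0"}))
}
```

Because unset fields fall back to their zero values, existing call sites keep compiling when a new option is introduced.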
Jonathan A. Sternberg 23f6a706bb Support cast syntax for selecting a specific type
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.

This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree. If
no types are given, the same ordering for how a type is chosen is used
to determine which type to return.

The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.

Fixes #6519.
2016-05-16 12:08:29 -04:00
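
The `name::type` form introduced in 23f6a706bb can be illustrated with a small splitter. This is not the InfluxQL parser: parseCast is a hypothetical helper that only separates the name from the annotation and does not validate the type.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCast splits a reference such as "field1::float" into its name and
// its type annotation. A reference without "::" yields an empty annotation,
// which stands for "use the default type ordering".
func parseCast(ref string) (name, typ string) {
	if i := strings.LastIndex(ref, "::"); i >= 0 {
		return ref[:i], ref[i+2:]
	}
	return ref, ""
}

func main() {
	for _, ref := range []string{"field1::float", "tag1::tag", "value"} {
		name, typ := parseCast(ref)
		fmt.Printf("name=%q type=%q\n", name, typ)
	}
}
```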
Jason Wilder 6cc1a34704 Rename cluster package to coordinator 2016-05-11 11:41:05 -06:00