Commit Graph

90 Commits (de1a0eb2a919548b10f5a81bea427e1f268daf0b)

Author SHA1 Message Date
Sam Arnold dd3baf6d4a
feat: measurement metrics by login (#20687)
After turning on authentication and both forms of ingress metrics:

"ingress": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"cq","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":76}},
"ingress:1": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"database","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":152}},
"ingress:2": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"httpd","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":874}},
"ingress:3": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"ingress","rp":"monitor"},"values":{"pointsWritten":534,"valuesWritten":1068}},
"ingress:4": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"localStore","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":76}},
"ingress:5": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"queryExecutor","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":190}},
"ingress:6": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"runtime","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":570}},
"ingress:7": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"shard","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":836}},
"ingress:8": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"subscriber","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":114}},
"ingress:9": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_cache","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":684}},
"ingress:10": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_engine","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":2204}},
"ingress:11": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_filestore","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":152}},
"ingress:12": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_wal","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":304}},
"ingress:13": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"write","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":342}},
"ingress:14": {"name":"ingress","tags":{"db":"telegraf","login":"admin","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1,"valuesWritten":1}},
"ingress:15": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1316,"valuesWritten":13160}},
"ingress:16": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"disk","rp":"autogen"},"values":{"pointsWritten":642,"valuesWritten":4494}},
"ingress:17": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"diskio","rp":"autogen"},"values":{"pointsWritten":214,"valuesWritten":2354}},
"ingress:18": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"mem","rp":"autogen"},"values":{"pointsWritten":107,"valuesWritten":963}},
"ingress:19": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"processes","rp":"autogen"},"values":{"pointsWritten":107,"valuesWritten":856}},
"ingress:20": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"swap","rp":"autogen"},"values":{"pointsWritten":214,"valuesWritten":642}},
"ingress:21": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"system","rp":"autogen"},"values":{"pointsWritten":321,"valuesWritten":749}},

Only by login:

"ingress": {"name":"ingress","tags":{"login":"_systemuser_monitor"},"values":{"pointsWritten":42,"valuesWritten":354}},
"ingress:1": {"name":"ingress","tags":{"login":"admin"},"values":{"pointsWritten":1,"valuesWritten":1}},
"ingress:2": {"name":"ingress","tags":{"login":"telegraf"},"values":{"pointsWritten":3547,"valuesWritten":28246}},

Notice writes by users 'telegraf', '_systemuser_monitor', and 'admin'.
2021-02-04 11:52:53 -05:00
davidby-influx 9e33be2619
fix(error): SELECT INTO doesn't return error with unsupported value (#20429)
When a SELECT INTO query generates an illegal value that cannot be inserted,
like +/- Inf, it should return an error, rather than failing silently.
This adds a boolean parameter to the [data] section of influxdb.conf:
* strict-error-handling
When false, the default, the old behavior is preserved.  When true,
unsupported values will return an error from SELECT INTO queries

Fixes https://github.com/influxdata/influxdb/issues/20426
2020-12-30 18:22:43 -08:00
Jonathan A. Sternberg 9f26eb7630
Drop NaN values when writing back points
When an NaN value was computed, it would be written back incorrectly as
a string type instead of being omitted. This happened very rarely in the
case that `stddev()` of a single value was computed and only when it was
being done on a new shard.

This correctly drops the value. The reason this wasn't correctly dropped
previously is because NaN values are represented as a `(*float64)(nil)`
which does not equal `nil` so the writeback system thought it was a
non-nil point, but the writer encoded it as a string.

In addition to the above, this also fixes the point writer to report the
number of points actually written rather than the number of points
desired to be written. Previously, if there was an error writing a point
for some reason, the point would be silently dropped, but still recorded
as a point that had been written. Now it reports the number of points
that were written and omits the ones that were dropped.
2018-12-06 10:55:46 -06:00
Jacob Marble 3cfbc33c0e Implement SHOW STATS FOR 'indexes' 2018-05-10 11:33:52 -07:00
Jacob Marble 232be14aef respect rp parameter in /query 2018-04-19 08:31:43 -07:00
Jonathan A. Sternberg df7a660fb3 Modify the Select call to return a Cursor
The Cursor returned will be capable of scanning rows into a structure.
It replaces part of the function for why the Emitter existed. The
Emitter would both join the resulting rows and then transform the values
into a models.Row so it could be returned to the results.

In the future, we will be able to use the Cursor directly to write out
values which should be more memory efficient.
2018-03-09 12:47:41 -06:00
Jonathan A. Sternberg 733d842812 Turn the ExecutionContext into a context.Context
Along with modifying ExecutionContext to be a context and have the
TaskManager return the context itself, this also creates a Monitor
interface and exposes the Monitor through the Context. This way, we can
access the monitor from within the query.Select method and keep all of
the limits inside of the query package instead of leaking them into the
statement executor.

An eventual goal is to remove the InterruptCh from the IteratorOptions
and use the Context instead, but for now, we'll just assign the done
channel from the Context to the IteratorOptions so at least they refer
to the same channel.
2018-03-08 14:03:20 -06:00
Jonathan A. Sternberg de4390ae83 Rename some of the structs and interfaces in the query package
Remove the `Query` prefix from some structs and interfaces. They were
there so when the query engine was in the same package as influxql,
these would be differentiated. Now that the package name is query, the
extra prefix seems redundant.
2018-03-02 09:44:12 -06:00
Edd Robinson 5a8f0202fb Ensure db specified for commands 2018-02-13 13:24:23 +00:00
Edd Robinson 67d1fa3972 Cleanup remaining packages 2018-01-21 12:08:25 -08:00
Edd Robinson 286c8f4c09 Return to original DELETE/DROP SERIES semantics
This reverts commit 59afd8cc90.
2018-01-15 12:00:30 +00:00
Edd Robinson 59afd8cc90 Return to original DELETE/DROP SERIES semantics
Since possibly v0.9 DELETE SERIES has had the unwanted side effect of
removing series from the index when the last traces of series data are
removed from TSM. This occurred because the inmem index was rebuilt on
startup, and if there was no TSM data for a series then there could be
not series to add to the index.

This commit returns to the original (documented) DROP/DETETE SERIES
behaviour. As such, when issuing DROP SERIES all instances of matching
series will be removed from both the TSM engine and the index. When
issuing DELETE SERIES only TSM data will be removed.

It is up to the operator to remove series from the index.

NB, this commit does not address how to remove series data from the
series file when a shard rolls over.
2017-12-15 00:02:06 +00:00
Edd Robinson 6851db3fc9 Add FGA support to SHOW MEASUREMENTS 2017-11-17 11:06:43 +00:00
Edd Robinson d581aee285 Ensure all retention policies queried 2017-11-08 16:27:57 +00:00
Ben Johnson 156f25ac23
Improve SHOW TAG KEYS performance. 2017-11-07 10:59:19 -07:00
Edd Robinson fbcb299b8a Support WHERE time clause in SHOW TAG VALUES
This commit adds time support to SHOW TAG VALUES. Time can be used as
both a lower and upper boundary. However, there are some caveats.

For the `inmem` index, filtering by time will still return all results
because the index data is shared across shards.

For the `tsi1` index, filtering by time will only work down to the shard
lever. Specifically, when querying by time all shards within that time
range will be used to generate the results.
2017-11-06 19:15:01 +00:00
Edd Robinson 98d584b63f Use index for SHOW X meta queries
When a meta query does not include a time component then it can be
answered exclusively by the index. This should result in a much faster
query execution that if the TSM engine was engaged.

This commit rewrites the following queries such that they make use
of the index where no time component is present:

  - SHOW MEASUREMENTS
  - SHOW SERIES
  - SHOW TAG KEYS
  - SHOW FIELD KEYS
2017-11-06 19:15:00 +00:00
Stuart Carnie f3d45ba301 influxdata/influxdb/influxql -> influxdata/influxql 2017-10-30 14:40:26 -07:00
Edd Robinson dd3206d796 Set column name for estimations 2017-10-26 16:22:48 +01:00
Edd Robinson 47c0840d5b SHOW TAG KEY EXACT CARDINALITY 2017-10-26 16:22:31 +01:00
Edd Robinson f80591bfa1 Implement MEASUREMENT cardinality estimation 2017-10-26 16:22:31 +01:00
Edd Robinson 3079b41f00 Implement series cardinality estimation 2017-10-26 16:22:31 +01:00
Stuart Carnie c51ba16287 fixes #9007 2017-10-25 13:08:55 -07:00
Stuart Carnie e9313876ab EXPLAIN ANALYZE
* Introduces EXPLAIN ANALYZE command, which
  produces a detailed tree of operations used to
  execute the query.

introduce context.Context to APIs

metrics package

* create groups of named measurements
* safe for concurrent access

tracing package

EXPLAIN ANALYZE implementation for OSS

Serialize EXPLAIN ANALYZE traces from remote nodes

use context.Background for tests

group with other stdlib packages

additional documentation and remove unused API

use influxdb/pkg/testing/assert

remove testify reference
2017-10-20 08:01:37 -07:00
Joe LeGasse 1443b22379 auth: add series auth to 'show tag values' 2017-09-27 20:01:18 -04:00
Jonathan A. Sternberg 50d404e690 Initial implementation of explain plan
It prints the statistics of each iterator that will access the storage
engine. For each access of the storage engine, it will print the number
of shards that will potentially be accessed, the number of files that
may be accessed, the number of series that will be created, the number
of blocks, and the size of those blocks.
2017-09-01 09:01:10 -05:00
Ben Johnson 1dbe0662d8
Use system cursors for measurement, series, and tag key meta queries. 2017-08-30 08:35:20 -06:00
Jonathan A. Sternberg 5593eecda6 Update parser and AST for explain statement 2017-08-28 11:36:06 -05:00
Jonathan A. Sternberg 8738e72cf1 Refactor the select call into three separate phases
The first call is to compile the query. This performs some initial
processing that can be done before having any access to the shards. At
the moment, it does very little, but it's intended to be changed to
eventually perform initial validations of the query and create an
internal graph structure for the execution of the query.

The second call is to prepare the query. This step has access to the
shard mapper. Right now, it just maps the shards and rewrites the fields
of the query for any wildcards. In the future, it is intended to do the
above, but also to prepare the final directed acyclical graph that will
execute the query.

The third call is to select the query. This step is intended to create
all of the iterators for processing the query. At the moment, much of
the work intended for the second step is performed in the third step.
2017-08-25 07:50:13 -05:00
Jonathan A. Sternberg 421a91d480 Pass the select options to the shard mapper again 2017-08-24 09:55:02 -05:00
Jonathan A. Sternberg 96689e661e Move query engine code from the statement executor to the query engine
The statement rewriting logic should be in the query engine as part of
preparing a query. This creates a shard mapper interface that the query
engine expects and then passes it to the query engine instead of
requiring the query to be preprocessed before being input into the query
engine. This interface is (mostly) the same as the old interface, just
moved to a different package.
2017-08-23 10:07:30 -05:00
Jonathan A. Sternberg 8bd04ebe39 Remove TimeRange function and replace with a more accurate ConditionExpr function
The ConditionExpr function is more accurate because it parses the
condition and ensures that time conditions are actually used correctly.
That means that attempting to combine conditions with OR will not result
in the query silently pretending it's an AND and nested conditions work
correctly so there is only one way to read the query.

It also extracts the non-time conditions into a separate condition so we
can stop attempting to parse around the time conditions in lower layers
of the storage engine. This change does not remove those hacks, but a
following commit should be able to sanitize the condition and remove
them.
2017-08-16 16:45:35 -05:00
Jonathan A. Sternberg 9a2357c2c0 Separate the query engine into a separate package
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
2017-08-16 13:38:43 -05:00
Jonathan A. Sternberg 950753d036 Parse time literals using the time zone in the select statement 2017-07-27 13:05:51 -05:00
Joe LeGasse 815f740f4c initial fga work
wip

wip

fix tests / build
2017-05-26 13:16:27 -07:00
Edd Robinson fddaff2cc8 Merge master in 2017-03-29 18:00:28 +01:00
Jonathan A. Sternberg 347b01814e Support timezone offsets for queries
The timezone for a query can now be added to the end with something like
`TZ("America/Los_Angeles")` and it will localize the results of the
query to be in that timezone. The offset will automatically be set to
the offset for that timezone and offsets will automatically adjust for
daylight savings time so grouping by a day will result in a 25 hour day
once a year and a 23 hour day another day of the year.

The automatic adjustment of intervals for timezone offsets changing will
only happen if the group by period is greater than the timezone offset
would be. That means grouping by an hour or less will not be affected by
daylight savings time, but a 2 hour or 1 day interval will be.

The default timezone is UTC and existing queries are unaffected by this
change.

When times are returned as strings (when `epoch=1` is not used), the
results will be returned using the requested timezone format in RFC3339
format.
2017-03-22 15:09:41 -05:00
Ben Johnson 358b1e0b05
Merge remote-tracking branch 'upstream/master' into tsi 2017-03-15 10:13:32 -06:00
Mark Rushakoff 53699aa24f Allow non-admin users to execute SHOW DATABASES
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.

When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.

When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.

This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.

Fixes #4785.
2017-02-13 08:59:16 -08:00
Edd Robinson 91ee34b111 Merge pull request #7837 from influxdata/er-tidy
General tidy up and subtle bug fixes
2017-01-26 13:43:07 +00:00
Ben Johnson 047c21f4d9
Merge remote-tracking branch 'upstream/master' into tsi 2017-01-24 09:28:58 -07:00
Edd Robinson 0804cdb7b5 Ensure rp names validated in CREATE DATABASE WITH 2017-01-23 19:00:19 +00:00
Edd Robinson fb7388cdfc Remove dead code from various pkgs 2017-01-17 09:47:34 -08:00
Joe LeGasse 2db0250b22 Add db/rp name validation
This change adds some very basic name validation with the following
plain-english description: names must be non-zero sequence of printable
characters that do not contain slashes ('/' or '\') and are not equal to
either "." or "..".

The intent is that, since we currently just use database and retention
policy names directly as path elements, these rules will hopefully leave
us with names that should be at least close to valid directory names.

Ideally, we would restrict names even further or not use them as path
elements directly, but this should be a step towards the former without
restricting names "too much"
2017-01-12 17:38:10 -05:00
Joe LeGasse b19260fb26 Add some checks before removing directories
Fixes #7822

This change first ensures that databases and retention policies exist
before attempting to remove them from the Store. It also adds some
checks in the `DeleteDatabase` and `DeleteRetentionPolicy` to ensure
that maliciously named entries won't remove anything outside of the
configured data directory.
2017-01-12 17:38:10 -05:00
Jonathan A. Sternberg d7c8c7ca4f Support subquery execution in the query language
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.

Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).

The syntax is like this:

    SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)

This will execute derivative and then sum the result of those derivatives.
Another example:

    SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)

This would let you find the maximum minimum value of each host.

There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:

    SELECT mean(value) FROM cpu
    SELECT mean(value) FROM (SELECT value FROM cpu)

Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
2017-01-07 13:00:48 -06:00
Ben Johnson 9f8b206b51
Fix measurement system queries. 2017-01-05 10:15:34 -07:00
Mark Rushakoff 1d3da81a7d Update godoc for the coordinator package. 2016-12-30 11:58:43 -08:00
Jonathan A. Sternberg b4db76cee2 Introduce syntax for marking a partial response with chunking
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.

This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.

Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
2016-11-22 11:16:22 -06:00
Jonathan A. Sternberg 64c2d704da Avoid deadlock when max-row-limit is hit
When the `max-row-limit` was hit, the goroutine reading from the results
channel would stop reading from the channel, but it didn't signal to the
sender that it was no longer reading from the results. This caused the
sender to continue trying to send results even though nobody would ever
read it and this created a deadlock.

Include an `AbortCh` on the `ExecutionContext` that will signal when
results are no longer desired so the sender can abort instead of
deadlocking.
2016-11-08 13:12:28 -06:00