Commit Graph

1773 Commits (0572f14a5c40a5828555d7f845f9a8bc4176430f)

Author SHA1 Message Date
Cory LaNou 775c5d243d Add changelog for 8187 2017-04-13 13:33:25 -05:00
Jason Wilder 91bfc5772a Update changelog 2017-04-04 16:39:53 -06:00
Jonathan A. Sternberg b1caafe82f Ensure the input for certain functions in the query engine are ordered
The following functions require ordered input but were not guaranteed to
receive ordered input:

* `distinct()`
* `sample()`
* `holt_winters()`
* `holt_winters_with_fit()`
* `derivative()`
* `non_negative_derivative()`
* `difference()`
* `moving_average()`
* `elapsed()`
* `cumulative_sum()`
* `top()`
* `bottom()`

These function calls have now been modified to request that their input
be ordered by the query engine. This will prevent the improper output
that could have been caused by multiple series being merged together or
multiple shards being merged together potentially incorrectly when no
time grouping was specified.

Two additional functions were already correct to begin with (so there
are no bugs with these two, but I'm including their names for
completeness).

* `median()`
* `percentile()`
2017-04-04 09:20:43 -05:00
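To illustrate why ordering matters for the functions listed in the commit above, here is a minimal Go sketch. It is not the query engine's code: `point`, `difference`, and `sortByTime` are hypothetical names, and the unordered slice only imitates points from two shards being concatenated without regard to time.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical, simplified stand-in for a query engine point.
type point struct {
	Time  int64
	Value float64
}

// difference returns the difference between each consecutive pair of values,
// in the order the points are given.
func difference(points []point) []float64 {
	if len(points) < 2 {
		return nil
	}
	out := make([]float64, 0, len(points)-1)
	for i := 1; i < len(points); i++ {
		out = append(out, points[i].Value-points[i-1].Value)
	}
	return out
}

// sortByTime returns a copy of points ordered by ascending time.
func sortByTime(points []point) []point {
	sorted := append([]point(nil), points...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Time < sorted[j].Time })
	return sorted
}

func main() {
	// Points from two shards concatenated without regard to time.
	merged := []point{{Time: 3, Value: 30}, {Time: 4, Value: 40}, {Time: 1, Value: 10}, {Time: 2, Value: 20}}
	fmt.Println(difference(merged))             // [10 -30 10]
	fmt.Println(difference(sortByTime(merged))) // [10 10 10]
}
```

The spurious `-30` in the first result is the kind of improper output that can appear when a function that assumes ordered input is fed a naive merge.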
Jonathan A. Sternberg 3c0d1c1bb5 Fix a regression when math was used with selectors
If there were multiple selectors and math, the query engine would
mistakenly think it was the only selector in the query and would not
match their timestamps.

Fixed the query engine to pass whether the selector should be treated as
a selector so queries like `max(value) * 1, min(value) * 1` will match
the timestamps of the result.
2017-04-04 09:20:43 -05:00
Edd Robinson 57b8993e7b Reduce cost of admin user check
This commit adds a caching mechanism to the Data object, such that
when large numbers of users exist in the system, the cost of determining
if there is at least one admin user will be low.

To ensure that previously marshalled Data objects contain the correct
cached admin user value, we exhaustively determine if there is an admin
user present whenever we unmarshal a Data object.
2017-04-03 12:06:44 +01:00
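The caching idea in the commit above can be sketched roughly as follows. This is an illustration under assumptions, not the actual meta package: the `data` and `userInfo` types, the method names, and the use of JSON as the serialization format are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// userInfo and data are hypothetical, simplified stand-ins for the meta types.
type userInfo struct {
	Name  string `json:"name"`
	Admin bool   `json:"admin"`
}

type data struct {
	Users []userInfo `json:"users"`

	// adminUserExists caches whether any user is an admin. It is unexported,
	// so it is never serialized and must be rebuilt after unmarshalling.
	adminUserExists bool
}

// hasAdminUser scans all users; this is the expensive check being cached.
func (d *data) hasAdminUser() bool {
	for _, u := range d.Users {
		if u.Admin {
			return true
		}
	}
	return false
}

// AdminUserExists returns the cached answer in constant time.
func (d *data) AdminUserExists() bool { return d.adminUserExists }

// CreateUser appends a user and keeps the cache current.
func (d *data) CreateUser(name string, admin bool) {
	d.Users = append(d.Users, userInfo{Name: name, Admin: admin})
	if admin {
		d.adminUserExists = true
	}
}

// UnmarshalJSON decodes the users and then exhaustively recomputes the cache,
// so data serialized before the cache existed still gets a correct value.
func (d *data) UnmarshalJSON(b []byte) error {
	type plain data // same fields, no methods: avoids recursing into UnmarshalJSON
	var p plain
	if err := json.Unmarshal(b, &p); err != nil {
		return err
	}
	*d = data(p)
	d.adminUserExists = d.hasAdminUser()
	return nil
}

func main() {
	var d data
	raw := []byte(`{"users":[{"name":"alice","admin":false},{"name":"root","admin":true}]}`)
	if err := json.Unmarshal(raw, &d); err != nil {
		panic(err)
	}
	fmt.Println(d.AdminUserExists()) // true
}
```

The full scan happens once per unmarshal; every later check is a single field read, regardless of how many users exist.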
Jason Wilder 2972a3f223 Remove MMAP dereferencing code
This code was added to address some slow startup issues.  It is believed
to be the cause of some segfault panics that occur at query time when
the underlying MMAP array has been unmapped.  The current structure of
the code makes the dereferencing code unnecessary now.
2017-03-23 12:46:23 -06:00
Jason Wilder c9740f753b Disable max-row-limit by default
max-row-limit was set at 10000 since 1.0, but due to a bug it was
effectively 0 (disabled).  1.2 fixed this bug via #7368, but this
caused a breaking change w/ Grafana and any users upgrading from <1.2
who had not disabled the config manually.
2017-03-14 12:47:32 -06:00
Jason Wilder 3ec60fe264 Update v1.2.1 release date 2017-03-08 12:26:07 -07:00
Jonathan A. Sternberg 83cf8893e1 Include IsRawQuery in the rewritten statement for meta queries 2017-03-06 14:46:33 -06:00
Jason Wilder eab012ef61 Fix points missing after compaction
If blocks containing overlapping ranges of time were partially
recombined, it was possible for some points to get dropped
during compactions.  This occurred because the window of time of
the points we need to merge did not account for the partial blocks
created from a prior merge.

Fixes #8084
2017-03-06 10:17:11 -07:00
Jason Wilder 29f8d8de76 Fix race in WALEntry.Encode and Value.Deduplicate
Under high query load, a race exists in the cache and the WAL.  Since
writes currently hit the cache first, they are available for query before
they hit the WAL.  If the WAL is writing and accessing the Value slice
at the same time that a query is run that needs to dedup the same slice,
a race occurs.

To fix this, the cache now just copies the values instead of storing the
slice passed in.  Another way to fix this might be to have the writes go
to the WAL before the cache.  I think the latter would be better, but it
introduces some larger write path issues that we'd need to also address.
e.g. if the cache was full, writes to the WAL would need to be rejected
to avoid filling the disk.

Copying the slice in the cache is simpler for now and does not appear to
dramatically affect performance.
2017-03-06 09:38:22 -07:00
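A rough sketch of the copy-on-write fix described in the commit above, assuming a much simpler cache than the real one. The `value` and `cache` types and their methods are hypothetical; the point is only that the cache keeps its own copy of the elements instead of retaining the caller's slice.

```go
package main

import (
	"fmt"
	"sync"
)

// value and cache are hypothetical, simplified stand-ins for the storage types.
type value struct {
	Time int64
	V    float64
}

type cache struct {
	mu    sync.RWMutex
	store map[string][]value
}

func newCache() *cache { return &cache{store: make(map[string][]value)} }

// Write copies the elements into the cache's own backing array. The unsafe
// alternative, `c.store[key] = values`, would share the caller's array with
// whoever else still holds it (for example a WAL encoder).
func (c *cache) Write(key string, values []value) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.store[key] = append(c.store[key], values...)
}

// Values returns a copy so readers can sort or deduplicate freely.
func (c *cache) Values(key string) []value {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return append([]value(nil), c.store[key]...)
}

func main() {
	c := newCache()
	in := []value{{Time: 1, V: 1.5}, {Time: 2, V: 2.5}}
	c.Write("cpu", in)

	in[0].V = 99 // the writer keeps using its slice

	fmt.Println(c.Values("cpu")[0].V) // 1.5
}
```

The cost is one extra copy per write, which matches the commit's observation that copying is the simpler option.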
Ben Johnson 4c202eea09 Re-check field type under write lock. 2017-03-03 09:47:43 -07:00
Ben Johnson dffd12319c Add point.UnmarshalBinary() bounds checking. 2017-03-01 12:01:25 -07:00
Jonathan A. Sternberg c5970b59b4 Map types correctly when selecting a field with multiple measurements where one of the measurements is empty 2017-03-01 11:47:26 -06:00
Jonathan A. Sternberg b942f3a373 Merge pull request #8069 from influxdata/js-8044-measurement-with-underscore-prefix
Treat non-reserved measurement names with underscores as normal measurements
2017-02-28 10:21:26 -06:00
Jason Wilder 414fe1349e Merge 1.1.4 changelog 2017-02-27 16:51:07 -07:00
Jonathan A. Sternberg 1081785cb4 Treat non-reserved measurement names with underscores as normal measurements
A measurement name that begins with an underscore and does not conflict
with one of the reserved measurement names will now be passed untouched
to the underlying shards rather than being intercepted as an empty
measurement.

A user still shouldn't rely on measurements that begin with underscores
to always be accessible, but this will prevent the most common use case
from causing unexpected behavior since we will very rarely, if ever, add
additional system sources.
2017-02-27 16:49:02 -06:00
Jonathan A. Sternberg 1fb34e3eef Dividing aggregate functions with different outputs doesn't panic 2017-02-23 18:38:29 -06:00
David Norton c7fa58473f fix #8028: call api.NewTypesDB() instead of new
The code was calling new(api.TypesDB) which didn't initialize an
unexported map inside of the type. Call api.NewTypesDB() instead.
2017-02-23 18:04:21 -05:00
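The difference between `new(T)` and a constructor, as in the commit above, can be shown with a small stand-in type. `typesDB`, `newTypesDB`, `add`, and `tryAdd` are hypothetical names used for illustration; only the general Go behavior (writing to a nil map panics) is relied on.

```go
package main

import "fmt"

// typesDB is a hypothetical stand-in for a type with an unexported map.
type typesDB struct {
	rows map[string][]string
}

// newTypesDB initializes the map; the zero value from new(typesDB) does not.
func newTypesDB() *typesDB {
	return &typesDB{rows: make(map[string][]string)}
}

// add records a data set. It panics if the map was never initialized.
func (db *typesDB) add(name string, sources ...string) {
	db.rows[name] = sources
}

// tryAdd reports whether add completed without panicking.
func tryAdd(db *typesDB) (ok bool) {
	defer func() {
		if r := recover(); r != nil {
			ok = false
		}
	}()
	db.add("cpu", "value")
	return true
}

func main() {
	fmt.Println(tryAdd(new(typesDB))) // false: assignment to entry in nil map
	fmt.Println(tryAdd(newTypesDB())) // true
}
```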
Jonathan A. Sternberg 72e4dd01b9 Properly select a tag within a subquery
Previously, subqueries would only be able to select tag fields within a
subquery if the subquery used a selector. But, it didn't filter out
aggregates like `mean()` so it would panic instead.

Now it is possible to select the tag directly instead of rewriting the
query in an invalid way.

Some queries in this form will still not work though. For example, the
following still does not function (but also doesn't panic):

    SELECT count(host) FROM (SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host))
2017-02-23 11:16:22 -06:00
Jonathan A. Sternberg 5a2b458180 Reduce the expression in a subquery to avoid a panic
The builder used for subqueries does not handle parentheses, but a set
of parentheses wrapping a field would cause it to panic. This code now
reduces the expression so the parentheses are removed before being
processed.
2017-02-23 10:14:05 -06:00
Ben Johnson 78a9bb2527 Remove Tags.shouldCopy, replace with forceCopy on series creation.
Previously, tags had a `shouldCopy` flag to indicate if those tags
referenced an underlying buffer and should be copied to allow GC.
Unfortunately, this prevented tags from being copied that were
created and referenced the mmap which caused segfaults.

This change removes the `shouldCopy` flag and replaces it with a
`forceCopy` argument in `CreateSeriesIfNotExists()`. This allows
the write path to indicate that tags must be cloned on insert.
2017-02-21 11:13:35 -07:00
Ben Johnson 8e79ca5d75 Fix tag dereferencing panic.
Clones series tags under lock during var ref iterator creation.
2017-02-15 17:56:47 -07:00
Jonathan A. Sternberg 71f62d33e6 Map types correctly when using a regex and one of the measurements is empty 2017-02-13 18:14:29 -06:00
Jonathan A. Sternberg 2ad1668c2a Prevent a panic when aggregates are used in an inner query with a raw query
The following types of queries will panic:

    SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host)
    SELECT top(sum, host, 3) FROM (SELECT sum(value) FROM cpu GROUP BY host)

These queries _should_ work, but due to a current limitation with
aggregate functions, the aggregate functions won't return any auxiliary
fields. So even if a tag is not an auxiliary field, it is treated that
way by the query engine and this query will fail.

Fixing this properly will take a longer period of time. This fix just
prevents the panic from killing the server while we fix this for real.
2017-02-08 11:44:56 -06:00
Jonathan A. Sternberg e1fa48d0dd Fix ORDER BY time DESC with ordering series keys
The order of series keys is in ascending alphabetical order, not
descending alphabetical order, when it is ordered by descending time.
This fixes the ordering so points are returned in descending order. The
emitter also had the conditions for choosing which iterator to use in
the wrong direction (which only affects aggregates with `FILL(none)`).
2017-02-06 15:49:12 -06:00
Jonathan A. Sternberg 95831b3307 Fix LIMIT and OFFSET when they are used in a subquery
This fixes LIMIT and OFFSET when they are used in a subquery where the
grouping of the inner query is different than the grouping of the outer
query. When organizing tag sets, the grouping of the outer query is
used so the final result is in the correct order. But, unfortunately,
the optimization incorrectly limited the number of points based on the
grouping in the outer query rather than the grouping in the inner query.

The ideal solution would be to use the outer grouping to further
organize it by the grouping for the inner subquery, but that's more
difficult to do at the moment. As an easier fix, the query engine now
limits the output of each series. This may result in these types of
queries being slower in some situations like this one:

    SELECT mean(value) FROM (SELECT value FROM cpu GROUP BY host LIMIT 1)

This will be slower in a situation where the `cpu` measurement has a
high cardinality and many different tags.

This also fixes `last()` and `first()` when they are used in a subquery
because those functions use `LIMIT 1` as an internal optimization.
2017-02-06 14:04:34 -06:00
Jonathan A. Sternberg caaad60dcf Fix authentication when subqueries are present
The code that checked if a query was authorized did not account for
sources that were subqueries. Now, the check for the required privileges
will descend into the subquery and add the subquery's required
privileges to the list of required privileges for the entire query.
2017-02-06 09:43:14 -06:00
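The recursive privilege collection described in the commit above might look something like this sketch. The AST types and `requiredReadDatabases` are hypothetical simplifications; the real check deals with full privilege objects rather than bare database names.

```go
package main

import "fmt"

// The types below are hypothetical, simplified stand-ins for the query AST.
type source interface{}

type measurement struct {
	Database string
	Name     string
}

type subQuery struct {
	Statement *selectStatement
}

type selectStatement struct {
	Sources []source
}

// requiredReadDatabases lists every database the statement reads from,
// descending into subqueries so their sources are checked as well.
func requiredReadDatabases(stmt *selectStatement) []string {
	var dbs []string
	for _, src := range stmt.Sources {
		switch s := src.(type) {
		case *measurement:
			dbs = append(dbs, s.Database)
		case *subQuery:
			dbs = append(dbs, requiredReadDatabases(s.Statement)...)
		}
	}
	return dbs
}

func main() {
	inner := &selectStatement{Sources: []source{&measurement{Database: "secret", Name: "cpu"}}}
	outer := &selectStatement{Sources: []source{&subQuery{Statement: inner}}}
	fmt.Println(requiredReadDatabases(outer)) // [secret]
}
```

Without the `*subQuery` case, the outer statement would appear to need no privileges at all, which is the gap the commit closes.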
Jonathan A. Sternberg e49ba016fa Fix incorrect math when aggregates that emit different times are used
When using `non_negative_derivative()` and `last()` in a math aggregate
with each other, the math would not be matched with each other because
one of those aggregates would emit one fewer point than the others. The
math iterators have been modified so they now track the name and tags of
a point and match based on those.

This isn't necessarily ideal and may come to bite us in the future. We
don't necessarily have a defined structure for all iterators so it can
be difficult to know which of two points is supposed to come first in
the ordering. This uses the common ordering that usually makes sense,
but the query engine is getting complicated enough where I am not 100%
certain that this is correct in all circumstances.
2017-02-02 14:40:41 -06:00
Joe LeGasse 37d4973609 updated CHANGELOG 2017-02-02 10:25:32 -05:00
Ben Johnson faef0a99c9 Perform series tag iteration under lock.
Adds a `tsdb.Series.ForEachTag()` function for safely iterating
over a series' tags within the context of a lock. This prevents
tags from being dereferenced during iteration which can cause
a seg fault.
2017-02-01 16:25:53 -07:00
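A minimal sketch of the locked iteration pattern from the commit above. The `series` and `tag` types here are simplified stand-ins, and `SetTags` is a hypothetical writer added only to show what the lock protects against.

```go
package main

import (
	"fmt"
	"sync"
)

// tag and series are hypothetical, simplified stand-ins for the tsdb types.
type tag struct {
	Key, Value string
}

type series struct {
	mu   sync.RWMutex
	tags []tag
}

// ForEachTag calls fn for every tag while holding the read lock, so the
// tags cannot be replaced from under the caller during iteration.
func (s *series) ForEachTag(fn func(tag)) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	for _, t := range s.tags {
		fn(t)
	}
}

// SetTags replaces the tags under the write lock.
func (s *series) SetTags(tags []tag) {
	s.mu.Lock()
	s.tags = tags
	s.mu.Unlock()
}

func main() {
	s := &series{}
	s.SetTags([]tag{{Key: "host", Value: "a"}, {Key: "region", Value: "west"}})
	s.ForEachTag(func(t tag) { fmt.Printf("%s=%s\n", t.Key, t.Value) })
}
```

In this sketch the callback must not call `SetTags` itself, since that would try to take the write lock while the read lock is held.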
Jonathan A. Sternberg e060fd0aa3 Fix EvalType when a parenthesis expression is used
It did not descend into the expression within the parenthesis correctly
and would just recurse infinitely on itself instead.
2017-01-31 10:35:21 -06:00
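The infinite recursion described in the commit above is easy to reproduce in miniature. The following sketch uses a hypothetical three-node AST and an `evalType` that returns a string; it only illustrates descending into the wrapped expression rather than the node itself.

```go
package main

import "fmt"

// The node types below are hypothetical, simplified stand-ins for the AST.
type expr interface{}

type numberLiteral struct{ Val float64 }

type integerLiteral struct{ Val int64 }

type parenExpr struct{ Expr expr }

// evalType reports the data type an expression evaluates to.
func evalType(e expr) string {
	switch e := e.(type) {
	case *numberLiteral:
		return "float"
	case *integerLiteral:
		return "integer"
	case *parenExpr:
		// Descend into the wrapped expression. Calling evalType(e) here
		// instead would recurse on the same node forever.
		return evalType(e.Expr)
	}
	return "unknown"
}

func main() {
	nested := &parenExpr{Expr: &parenExpr{Expr: &integerLiteral{Val: 1}}}
	fmt.Println(evalType(nested)) // integer
}
```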
Jonathan A. Sternberg 92c5d336b4 Expand query dimensions from the subquery
During development, I, at some point, decided that the dimensions should
be expanded based on what was available rather than what was present in
the subquery. I don't really know the rationale for this because I
forgot, but it doesn't make sense or seem to be particularly useful.

Expanding dimensions now just uses the values specified in the subquery
rather than expanding to all available dimensions of the measurement in
the subquery.
2017-01-25 16:02:37 -06:00
Jonathan A. Sternberg 552408c949 Fix mapping of types when the measurement uses a regex
With the new shard mapper implementation, regexes were just ignored so
it attempted to look up the field type inside of a measurement with no
name (which cannot possibly exist) so it would think the field didn't
exist and map it as the unknown type.
2017-01-25 09:49:51 -06:00
Jason Wilder b7bb7e8359 Update 1.2.0 release date 2017-01-23 20:01:48 -07:00
Cory LaNou d54a955068 allow partial writes on field conflicts 2017-01-23 11:54:46 -07:00
Edd Robinson 320c5981cb Fixes racy locking on measurement 2017-01-17 09:44:56 -08:00
Edd Robinson c47be5bb56 Ensure subscriber service respects config 2017-01-13 22:15:01 +00:00
Mark Rushakoff 7964a87310 Update CHANGELOG.md 2017-01-12 16:31:56 -08:00
Joe LeGasse b19260fb26 Add some checks before removing directories
Fixes #7822

This change first ensures that databases and retention policies exist
before attempting to remove them from the Store. It also adds some
checks in the `DeleteDatabase` and `DeleteRetentionPolicy` to ensure
that maliciously named entries won't remove anything outside of the
configured data directory.
2017-01-12 17:38:10 -05:00
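One way to guard against names that escape a data directory, in the spirit of the commit above, is sketched below. `safeJoin` is a hypothetical helper, not the check the Store actually performs; it rejects any name whose joined path does not stay strictly inside the root.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin joins name onto root and rejects results that are not strictly
// inside root. It is a hypothetical helper for illustration only.
func safeJoin(root, name string) (string, error) {
	if name == "" {
		return "", errors.New("empty name")
	}
	root = filepath.Clean(root)
	p := filepath.Join(root, name) // Join also cleans "..", "." and duplicate separators
	rel, err := filepath.Rel(root, p)
	if err != nil {
		return "", err
	}
	// "." means the root itself; ".." or a "../" prefix means outside of it.
	if rel == "." || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("invalid name %q: resolves outside %s", name, root)
	}
	return p, nil
}

func main() {
	for _, name := range []string{"mydb", "../../etc", "."} {
		p, err := safeJoin("/var/lib/influxdb/data", name)
		fmt.Printf("%q -> path=%q err=%v\n", name, p, err)
	}
}
```

A caller would only pass the returned path to something like `os.RemoveAll` when the error is nil.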
Jason Wilder 33be1e1952 Update changelog 2017-01-12 09:02:59 -07:00
Vladimir Lopes f05df2a263 Fix panic when pruning shard groups
* Fix #7812 - Panic when pruning shard groups

* Update CHANGELOG.md
2017-01-11 14:56:40 +00:00
Jonathan A. Sternberg 73b76d1227 Verbose output for SSL connection errors
When an error that appears to be an SSL error happens without SSL
enabled, the client will attempt to reconnect with SSL just to see if
that works. If it works, it exits with an error message telling the user
to add `-ssl`. It will also do the same if the SSL connection is unsafe
although it will warn that this is insecure.
2017-01-10 11:53:17 -06:00
Jonathan A. Sternberg d7c8c7ca4f Support subquery execution in the query language
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.

Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).

The syntax is like this:

    SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)

This will execute derivative and then sum the result of those derivatives.
Another example:

    SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)

This would let you find the maximum minimum value of each host.

There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:

    SELECT mean(value) FROM cpu
    SELECT mean(value) FROM (SELECT value FROM cpu)

Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
2017-01-07 13:00:48 -06:00
Mark Rushakoff 153277c01d Merge pull request #7786 from influxdata/mr-cache-decrease-size
Use one atomic operation in (*Cache).decreaseSize
2017-01-06 10:17:01 -08:00
Jason Wilder 15915446ff Merge pull request #7323 from miry/env-array-config
Allow add items to array config via ENV
2017-01-04 14:20:03 -07:00
Mark Rushakoff 89a587e865 Use one atomic operation in (*Cache).decreaseSize
The previous implementation was susceptible to a race condition (of
correctness) since c.decreaseSize is called without a lock in
(*Cache).WriteMulti.

There were already tests which asserted the correctness of the result of
decreaseSize, so no tests were added or modified.
2017-01-04 13:13:31 -08:00
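As a sketch of what a single-operation decrement can look like, the example below subtracts from an unsigned counter with one `atomic.AddUint64` call, using the two's-complement idiom documented in `sync/atomic`. The `cache` type here is a hypothetical stand-in, and the comparison with a separate load and store is an assumption about the general bug pattern, not a quote of the previous code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cache is a hypothetical stand-in holding only the size counter.
type cache struct {
	size uint64
}

// decreaseSize subtracts delta with a single atomic operation. Adding
// ^(delta - 1) is the sync/atomic idiom for subtracting from an unsigned
// value. A separate load followed by a store could lose concurrent updates.
func (c *cache) decreaseSize(delta uint64) {
	atomic.AddUint64(&c.size, ^(delta - 1))
}

// Size returns the current size.
func (c *cache) Size() uint64 {
	return atomic.LoadUint64(&c.size)
}

func main() {
	c := &cache{size: 1000}

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.decreaseSize(1)
		}()
	}
	wg.Wait()

	fmt.Println(c.Size()) // 900
}
```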
Cory LaNou 3c518f8927 panicking is bad -> error returns are good 2017-01-03 14:28:29 -06:00
Mark Rushakoff 959c445a88 Fix broken return statements swallowing errors
There was no comment on either case specifying that the `return nil`
was deliberate instead of `return err`, so I'm assuming these were
typos. I added tests to conserve the error-returning behavior.
2017-01-03 08:50:34 -08:00
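The bug pattern named in the commit above, an error that is detected and then dropped, looks roughly like this. All names are hypothetical; the sketch only contrasts `return nil` with `return err` inside an error branch.

```go
package main

import (
	"errors"
	"fmt"
)

// errDecode and decode are hypothetical; decode fails when ok is false.
var errDecode = errors.New("decode failed")

func decode(ok bool) error {
	if ok {
		return nil
	}
	return errDecode
}

// loadSwallowing shows the bug pattern: the error is detected but dropped.
func loadSwallowing(ok bool) error {
	if err := decode(ok); err != nil {
		return nil // bug: should be `return err`
	}
	return nil
}

// loadFixed propagates the error to the caller.
func loadFixed(ok bool) error {
	if err := decode(ok); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(loadSwallowing(false)) // <nil>
	fmt.Println(loadFixed(false))      // decode failed
}
```

A test that asserts a non-nil error from the failing path is enough to catch a regression to the swallowing form.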
Michael Nikitochkin 5ebd4244b1 Merge branch 'master' into env-array-config 2017-01-02 16:35:55 +01:00