Commit Graph

2001 Commits (618f0d0aa75ff2839974caf1dbd2082a743adc65)

Author SHA1 Message Date
Jason Wilder c25f7b8b3f Fix duplicate points returned after delete
The sortedSeriesIds slice was not getting reset to 0 which caused
the same series ids to exist in the slice more than once.  Since
the size of the slice never matched the size of the seriesID map,
it kept appending to the slice and sorting it, which caused multiple
cursors to get created for the same series.

Fixes #8531
2017-07-10 10:37:01 -06:00
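The slice-reset bug described in c25f7b8b3f can be illustrated with a minimal Go sketch. The `index` type and `rebuildSorted` method are hypothetical stand-ins, not the actual InfluxDB code; only the idea of truncating the cached slice before refilling it comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// index is a hypothetical structure: a set of series IDs plus a cached,
// sorted slice of the same IDs.
type index struct {
	seriesIDs       map[uint64]struct{}
	sortedSeriesIDs []uint64
}

// rebuildSorted refreshes the cached slice from the map. Truncating to
// length zero first is the important step: without it, each rebuild would
// append the IDs again and the slice would accumulate duplicates.
func (i *index) rebuildSorted() []uint64 {
	i.sortedSeriesIDs = i.sortedSeriesIDs[:0]
	for id := range i.seriesIDs {
		i.sortedSeriesIDs = append(i.sortedSeriesIDs, id)
	}
	sort.Slice(i.sortedSeriesIDs, func(a, b int) bool {
		return i.sortedSeriesIDs[a] < i.sortedSeriesIDs[b]
	})
	return i.sortedSeriesIDs
}

func main() {
	idx := &index{seriesIDs: map[uint64]struct{}{3: {}, 1: {}, 2: {}}}
	idx.rebuildSorted()
	delete(idx.seriesIDs, 2) // simulate a delete
	fmt.Println(idx.rebuildSorted())
}
```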
Stuart Carnie 649beba8b3 update CHANGELOG 2017-07-08 08:46:27 -07:00
Adam 7ac4f5b8c2 Merge pull request #8574 from influxdata/plutonium#1240_master
Add X-Influxdb-Build value to HTTP response header
2017-07-07 12:04:58 -04:00
Edd Robinson d2ce7060c5 Merge branch 'master' into backup_stdout 2017-07-07 16:27:39 +01:00
Edd Robinson 0d7059af04 Update CHANGELOG 2017-07-07 16:23:38 +01:00
Edd Robinson a43238618e Merge pull request #8512 from axiomhq/loglogbeta
Switch to LogLog-Beta Cardinality estimation
2017-07-07 16:14:16 +01:00
Adam 460b30bd08 removed blank line from changelog 2017-07-07 11:06:01 -04:00
Adam 2259ada8c3 adds a new header key/value X-Influxdb-Build that has value OSS if called from open-source build, and ENT if called from enterprise. This commit sets the value for the OSS case, and also creates the proper flag 2017-07-07 10:59:26 -04:00
Jason Wilder 3e7dfad7c4 Compress tombstone files
This adds a v3 format that is a gzip compressed version of the v2
format.  It reduces the size of tombstone files substantially without
having to support a more feature rich file format for tombstones.
2017-07-06 10:10:31 -06:00
Edd Robinson 7374e4e8a4 Merge pull request #8550 from influxdata/er-8548-panic
Allow panic recovery mechanism to be disabled
2017-07-05 22:09:09 +01:00
Edd Robinson 101af89987 Update CHANGELOG 2017-07-05 16:35:41 +01:00
Edd Robinson 12248b7233 Allow panic recovery to be disabled
This commit adds a new environment variable INFLUXDB_PANIC_CRASH, which
when set to a truthy value, e.g., true, TRUE, 1, will prevent the server
from recovering from a panic.

Recover currently occurs in two places: the HTTP handler and the
QueryExecutor. INFLUXDB_PANIC_CRASH will control both.

Further, this commit adds _internal stats that will monitor the
occurrence of panics all the time (regardless of if INFLUXDB_PANIC_CRASH
has been set to true or not).

The recovered panic frequency can be inspected with the following
queries:

SELECT "recoveredPanics" FROM "_internal"."monitor"."httpd";
SELECT "recoveredPanics" FROM "_internal"."monitor"."queryExecutor";
2017-06-29 19:44:25 +01:00
Ben Johnson 9e64813db8 Defer unlock all write locks in inmem index.
Currently two write locks in `inmem` are obtained and then
manually unlocked at function exit points. However, we have
reports that the `inmem` index is hanging on a write lock and
cannot track the issue down to anything else besides a lock
that could have been left unlocked because of a panic.

This commit changes the two locks to always defer their unlocks
to prevent these hangs.
2017-06-29 10:23:13 -06:00
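The difference a deferred unlock makes (9e64813db8) can be shown with a small sketch. The `index` type and `addSeries` method are hypothetical; the point is only that `defer` releases the write lock on every exit path, including a panic.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical in-memory index guarded by a mutex.
type index struct {
	mu     sync.RWMutex
	series map[string]int
}

// addSeries takes the write lock and defers the unlock, so the lock is
// released even if validate panics part-way through.
func (i *index) addSeries(key string, validate func(string)) {
	i.mu.Lock()
	defer i.mu.Unlock()
	validate(key) // may panic
	i.series[key]++
}

func main() {
	idx := &index{series: map[string]int{}}
	func() {
		defer func() { _ = recover() }()
		idx.addSeries("cpu", func(string) { panic("bad key") })
	}()
	// The lock was released despite the panic, so this call does not hang.
	idx.addSeries("cpu", func(string) {})
	fmt.Println(idx.series["cpu"])
}
```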
Edd Robinson e1a5ee4ede Ensure privileges can't be set on non-existent DB 2017-06-26 14:15:52 +01:00
Adam b12ee4e6ee 8426: updated code based on pull-request feedback 2017-06-23 14:59:01 -04:00
Adam 6ac3e0ab98 8426 Updated changelog to more clearly describe the new feature 2017-06-23 09:22:14 -04:00
Adam 0fb2de16ee 8426 updated changelog 2017-06-22 18:56:38 -04:00
Jason Wilder 12c7063566 Update changelog 2017-06-21 09:20:39 -06:00
Seif Lotfy 643b2eb30c Switch to LogLog-Beta Cardinality estimation
The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities,
therefore, it is more efficient and simpler to implement. Our simulations show that the accuracy provided by the new
algorithm is as good as or better than the accuracy provided by either of HyperLogLog or HyperLogLog++. The sparse
representation was kept in to provide better low cardinality accuracy. However the linear counting and range estimations
are replaced.
2017-06-20 15:25:01 +02:00
Stuart Carnie 932edd90b2 Merge branch 'master' into sgc-8188 2017-06-14 10:55:06 +10:00
Stuart Carnie 3657dbc256 update CHANGELOG with key names 2017-06-14 10:37:04 +10:00
Jonathan A. Sternberg e2c1da05e4 Merge pull request #8480 from influxdata/js-stats-interval
Change the default stats interval to 1 second instead of 10 seconds
2017-06-13 12:08:23 -05:00
Ben Johnson b51f604030 Fix TSI non-contiguous compaction panic.
This fixes the case where log files are compacted out of order
and cause non-contiguous sets of index files to be compacted.

Previously, the compaction planner would fetch a list of index files
for each level and compact them in order starting with the oldest
ones. This can be a problem for level 1 because level 0 (log files)
are compacted individually and in some cases a log file can finish
compacting before older log files are finished compacting. This
causes there to be a gap in the list of level 1 files that is
ignored when fetching a list of index files.

Now, the planner reads the list of index files starting from the
oldest but stops once it hits a log file. This prevents that gap
from being ignored.
2017-06-13 10:53:26 -06:00
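The planner rule from b51f604030 (read from the oldest file and stop at the first log file) can be sketched as below. The `file` type and `contiguousIndexFiles` function are hypothetical and ignore per-level filtering; this is a simplified illustration, not the TSI planner itself.

```go
package main

import "fmt"

// file is a hypothetical entry in the file list, ordered oldest first.
type file struct {
	id    int
	isLog bool // true for a log file that has not finished compacting
}

// contiguousIndexFiles returns the leading run of index files, stopping at
// the first log file so that a gap is never skipped over.
func contiguousIndexFiles(files []file) []file {
	var out []file
	for _, f := range files {
		if f.isLog {
			break
		}
		out = append(out, f)
	}
	return out
}

func main() {
	// File 3 is an older log file still compacting; file 4 finished first.
	files := []file{{1, false}, {2, false}, {3, true}, {4, false}}
	for _, f := range contiguousIndexFiles(files) {
		fmt.Println(f.id)
	}
}
```

In this example file 4 is left for a later pass, once file 3 has been compacted and the run is contiguous again.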
Jonathan A. Sternberg f7382982fd Change the default stats interval to 1 second instead of 10 seconds 2017-06-12 13:15:25 -05:00
marchtea 16dfe2a0ae update CHANGELOG.md 2017-06-12 11:00:27 +08:00
Stuart Carnie 2de52834f0 CQ statistics written to monitor database, addresses #8188
* off by default, enabled by `query-stats-enabled`
* writes to cq_query measurement of configured monitor database
* see CHANGELOG for schema of individual points
2017-06-10 09:20:38 +08:00
Ben Johnson bcc6ef769b Check file count before attempting a TSI level compaction.
This check was previously in a different section of code which
was lost during a refactor to the new compaction strategy. The
compaction planning now makes a check to ensure at least two
files are available for compaction in a level.
2017-06-06 11:08:59 -06:00
Stuart Carnie 98f2050bcb Update config sample and CHANGELOG 2017-06-05 22:05:00 +08:00
Ben Johnson 3128c6a42e Fix SHOW TAG VALUES deduplication. 2017-06-01 15:38:35 -06:00
Jonathan A. Sternberg 6a78f1cf4a URL query parameter credentials take priority over Authentication header 2017-05-30 09:26:24 -05:00
Jason Wilder f1181cc402 Update changelog 2017-05-24 14:47:01 -06:00
Jonathan A. Sternberg 9edf236cc8 Maintain the tags of points selected by top() or bottom() when writing the results
When a `SELECT ... INTO ...` is used with `top()` or `bottom()` used
with tags, the points will be written with the tags still intact instead
of converted to fields.
2017-05-23 15:00:21 -05:00
Ryan Betts b18a7e8deb Merge pull request #8396 from influxdata/changelog-merges
Add 1.2.4 and 1.1.5 CHANGELOG updates.
2017-05-23 14:31:09 -04:00
Jason Wilder 31d2309177 Update changelog 2017-05-22 14:53:06 -06:00
Jonathan A. Sternberg 4bdce21a9a Merge pull request #8394 from influxdata/js-top-bottom-performance
Optimize top() and bottom() using an incremental aggregator
2017-05-19 14:32:55 -05:00
Jason Wilder 55f2f83e34 Merge pull request #8407 from influxdata/jw-8392
Return partial write error when points outside of retention policy ar…
2017-05-19 11:25:08 -06:00
Jonathan A. Sternberg 7b9b55bfc0 Optimize top() and bottom() using an incremental aggregator
The previous version of `top()` and `bottom()` would gather all of the
points to use in a slice, filter them (if necessary), then use a
slightly modified heap sort to retrieve the top or bottom values.

This performed horrendously from the standpoint of memory. Since it
consumed so much memory and spent so much time in allocations (along
with sorting a potentially very large slice), this affected speed too.

These calls have now been modified so they keep the top or bottom points
in a min or max heap. For `top()`, a new point will read the minimum
value from the heap. If the new point is greater than the minimum point,
it will replace the minimum point and fix the heap with the new value.
If the new point is smaller, it discards that point. For `bottom()`, the
process is the opposite.

It will then sort the final result to ensure the correct ordering of the
selected points.

When `top()` or `bottom()` contain a tag to select, they have now been
modified so this query:

    SELECT top(value, host, 2) FROM cpu

Essentially becomes this query:

    SELECT top(value, 2), host FROM (
        SELECT max(value) FROM cpu GROUP BY host
    )

This should drastically increase the performance of all `top()` and
`bottom()` queries.
2017-05-19 11:56:46 -05:00
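The incremental aggregation described in 7b9b55bfc0 can be sketched with Go's `container/heap`. The `topN` function is a hypothetical, value-only simplification: it keeps at most `n` values in a min-heap and replaces the minimum when a larger value arrives. Sorting the final result by descending value is an assumption made for the example; the commit only says the result is sorted to ensure correct ordering.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// minHeap keeps the smallest retained value at index 0.
type minHeap []float64

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(float64)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topN returns the n largest values using O(n) memory instead of
// collecting and sorting every input value.
func topN(values []float64, n int) []float64 {
	if n <= 0 {
		return nil
	}
	h := &minHeap{}
	for _, v := range values {
		if h.Len() < n {
			heap.Push(h, v)
		} else if v > (*h)[0] {
			// The new value beats the current minimum: replace it and
			// restore the heap property.
			(*h)[0] = v
			heap.Fix(h, 0)
		}
		// Otherwise the value is discarded.
	}
	out := append([]float64(nil), *h...)
	sort.Sort(sort.Reverse(sort.Float64Slice(out)))
	return out
}

func main() {
	fmt.Println(topN([]float64{3, 9, 1, 7, 5}, 2))
}
```

A `bottom()` equivalent would use a max-heap and the opposite comparison.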
Jason Wilder afb1027bed Return partial write error when points outside of retention policy are dropped
Points written outside of a retention policy range were silently dropped. They
are dropped to prevent creating a shard that will be immediately deleted.  The
drop was silent and did not return an error response to the caller.

Fixes #8392
2017-05-19 10:50:03 -06:00
Jonathan A. Sternberg 7d043dbc61 Add nanosecond duration literal support 2017-05-19 10:44:11 -05:00
Edd Robinson a5fed3d296 Merge pull request #7862 from influxdata/er-debug-all
Adds handler for returning a profile archive
2017-05-17 17:09:39 +01:00
Ryan Betts a43856adc6 Add 1.2.4 and 1.1.5 CHANGELOG updates. 2017-05-16 16:51:29 -04:00
Edd Robinson 1cbbaa9317 Add support for shards, stats and diagnostics 2017-05-15 14:12:00 +01:00
Edd Robinson 8f8ff0ec61 Adds handler for returning a profile archive
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:

  curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
  curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
  curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
  curl -o cpu.txt "http://localhost:8086/debug/pprof/profile"

This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).

This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30, if absent.

The new command for gathering profiles from users should now be:

  curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"

Or, if we need to see a CPU profile:

  curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"

It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.

Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
2017-05-15 14:11:38 +01:00
Mark Rushakoff 6f438ea467 Update CHANGELOG 2017-05-12 17:09:09 -07:00
Jason Wilder 0b7c0b680c Update changelog 2017-05-12 14:05:24 -06:00
Jonathan A. Sternberg dea02009e0 Small edits to the etc/config.sample.toml file 2017-05-10 10:56:34 -05:00
Jonathan A. Sternberg 2780630a5f Track HTTP client requests for /write and /query with /debug/requests
After using `/debug/requests`, the client will wait for 30 seconds
(configurable by specifying `seconds=` in the query parameters) and the
HTTP handler will track every incoming query and write to the system.
After that time period has passed, it will output a JSON blob that looks
very similar to `/debug/vars` that shows every IP address and user
account (if authentication is used) that connected to the host during
that time.

In the future, we can add more metrics to track. This is an initial
start to aid with debugging machines that connect too often by looking
at a sample of time (like `/debug/pprof`).
2017-05-09 10:18:33 -05:00
Jason Wilder 29c2b1958e Fix deletes triggering unnecessary compactions
Tombstone files would be written to all TSM files even if the deleted
keys or timerange did not exist in the TSM file.  This had the side
effect of causing shards to get recompacted back to the same state. If
any shards or large numbers of TSM files existed, disk usage and CPU
utilization would spike causing issues.

This prevents tombstones being written for TSM files that could not
possibly contain the series keys being deleted or if the deleted time
range is outside the range of the file.
2017-05-08 14:52:28 -06:00
Ben Johnson 489c89bea4 Add tsi support tooling. 2017-05-08 11:00:15 -06:00
Jonathan A. Sternberg 260bdef3d4 Set the CSV output to an empty string for null values 2017-05-04 20:51:58 -05:00
Jason Wilder 684f5d884a Update changelog 2017-05-03 16:31:57 -06:00
Jonathan A. Sternberg df30a4d9c9 Refactor the subquery code and fix outer condition queries
This change refactors the subquery code into a separate builder class to
help allow for more reuse and make the functions smaller and easier to
read.

The previous function that handled most of the code was too big and
impossible to reason through.

This also goes and replaces the complicated logic of aggregates that had
a subquery source with the simpler IteratorMapper. I think the overhead
from the IteratorMapper will be more, but I also believe that the actual
code is simpler and more robust to produce more accurate answers. It
might be a future project to optimize that section of code, but I don't
have any actual numbers for the efficiency of one method and I believe
accuracy and code clarity may be more important at the moment since I am
otherwise incapable of reading my own code.
2017-04-28 17:12:32 -05:00
Jonathan A. Sternberg addc12561f Fix LIMIT and OFFSET for certain aggregate queries
When LIMIT and OFFSET were used with any functions that were not handled
directly by the query engine (anything other than count, max, min, mean,
first, or last), the input to the function would be limited instead of
receiving the full stream of values it was supposed to receive.

This also fixes a bug that caused the server to hang when LIMIT and
OFFSET were used with a selector. When using a selector, the limit and
offset should be handled before the points go to the auxiliary iterator
to be split into different iterators. Limiting happened afterwards which
caused the auxiliary iterator to hang forever.
2017-04-28 15:55:06 -05:00
Ben Johnson 3a46e5dd9e Remove default upper time bound for DELETE queries. 2017-04-28 12:26:26 -06:00
Jason Wilder a736f186f0 Merge pull request #8327 from influxdata/jw-go181
Update to go 1.8.1
2017-04-27 08:42:30 -06:00
Jonathan A. Sternberg be3bce5212 top() and bottom() now returns the time for every point
`top()` and `bottom()` will now organize the points by time and also
keep each point's original time even when a time grouping is used. At the
same time, `top()` and `bottom()` will no longer honor any fill options
that are present since they don't really make sense for these specific
functions.

This also fixes the aggregate and selectors to honor the ordered
iterator option so iterators remain ordered and to also respect the
buckets that are created by the final dimensions of the query so that
two buckets don't overlap each other within the same reducer. A test has
been added for this situation. This should clarify and encourage the use
of the ordered attribute within the query engine.
2017-04-26 15:07:10 -05:00
Jonathan A. Sternberg 4776b216a4 Merge pull request #8253 from influxdata/js-8065-restrict-top-bottom-query
Restrict top() and bottom() selectors to be used with no other functions
2017-04-26 15:06:30 -05:00
Jason Wilder 4db3b69b9d Update to go1.8.1 2017-04-26 11:32:42 -06:00
Jonathan A. Sternberg 1300f4cc6c Remove the admin UI 2017-04-25 16:58:24 -05:00
Jason Wilder 71825d20c8 Update changelog 2017-04-20 12:31:06 -06:00
Jason Wilder 5c51ae7319 Merge branch '1.2' into jw-merge-123 2017-04-14 14:36:54 -06:00
Cory LaNou 8c0f5a7dbe redact passwords before saving history in cli 2017-04-14 13:13:56 -05:00
Jonathan A. Sternberg 57a2abbc87 Restrict top() and bottom() selectors to be used with no other functions 2017-04-14 10:23:07 -05:00
Cory LaNou 775c5d243d Add changelog for 8187 2017-04-13 13:33:25 -05:00
Cory LaNou f96b59ed20 Add changelog for 8187 2017-04-13 10:31:31 -05:00
Jonathan A. Sternberg a550d323c4 Restrict fill(none) and fill(linear) to be usable only with aggregate queries 2017-04-10 15:58:05 -05:00
Jonathan A. Sternberg 0a5e4bd92b Implicitly cast null to false in binary expressions with a boolean
Also more consistently treat a binary expression with strings so it
produces the same value no matter the direction of the expression.
2017-04-06 12:26:04 -05:00
Jonathan A. Sternberg 45895862b7 Merge pull request #8058 from karlding/service-golinting
Make services/{admin, httpd, subscriber, udp} golintable
2017-04-05 12:30:11 -05:00
Jason Wilder 91bfc5772a Update changelog 2017-04-04 16:39:53 -06:00
Jason Wilder 5fa8073fc2 Merge branch '1.2' into jw-merge-123 2017-04-04 11:12:06 -06:00
Jonathan A. Sternberg b1caafe82f Ensure the input for certain functions in the query engine are ordered
The following functions require ordered input but were not guaranteed to
receive ordered input:

* `distinct()`
* `sample()`
* `holt_winters()`
* `holt_winters_with_fit()`
* `derivative()`
* `non_negative_derivative()`
* `difference()`
* `moving_average()`
* `elapsed()`
* `cumulative_sum()`
* `top()`
* `bottom()`

These function calls have now been modified to request that their input
be ordered by the query engine. This will prevent the improper output
that could have been caused by multiple series being merged together or
multiple shards being merged together potentially incorrectly when no
time grouping was specified.

Two additional functions were already correct to begin with (so there
are no bugs with these two, but I'm including their names for
completeness).

* `median()`
* `percentile()`
2017-04-04 09:20:43 -05:00
Jonathan A. Sternberg 3c0d1c1bb5 Fix a regression when math was used with selectors
If there were multiple selectors and math, the query engine would
mistakenly think it was the only selector in the query and would not
match their timestamps.

Fixed the query engine to pass whether the selector should be treated as
a selector so queries like `max(value) * 1, min(value) * 1` will match
the timestamps of the result.
2017-04-04 09:20:43 -05:00
Ryan Betts 4b1977673c Merge branch 'master' into timhallinflux-patch-2-1 2017-04-04 09:57:19 -04:00
Cory LaNou bc0759d0fc redact passwords before saving history in cli 2017-04-03 11:20:14 -05:00
Cory LaNou 7a6243eb58 Merge pull request #8122 from influxdata/cjl-influx-suppress-headers
suppress headers in output for influx when they are the same
2017-04-03 10:07:48 -05:00
Edd Robinson 57b8993e7b Reduce cost of admin user check
This commit adds a caching mechanism to the Data object, such that
when large numbers of users exist in the system, the cost of determining
if there is at least one admin user will be low.

To ensure that previously marshalled Data objects contain the correct
cached admin user value, we exhaustively determine if there is an admin
user present whenever we unmarshal a Data object.
2017-04-03 12:06:44 +01:00
Karl b22783f127 Update CHANGELOG 2017-04-02 18:33:09 -04:00
Tom Young d2fd3f50aa Add bitwise AND, OR and XOR operators to InfluxQL. 2017-03-31 21:02:02 +01:00
Jonathan A. Sternberg 211e7ea65d Merge pull request #8234 from influxdata/js-8230-fix-window-computation-overflow
Prevent overflowing or underflowing during window computation
2017-03-31 11:09:30 -05:00
zhexuany 232fdae6dd introduce a new function non_negative_difference 2017-03-31 23:08:36 +08:00
Jonathan A. Sternberg 64fb1db5f5 Prevent overflowing or underflowing during window computation
The Window function will now check before it adjusts the offset whether
it is going to overflow or underflow. If it is going to do either, it
sets the start or end time to MinTime or MaxTime.
2017-03-30 16:35:22 -05:00
timhallinflux e50ad6615d Update CHANGELOG.md 2017-03-30 12:46:53 -07:00
timhallinflux 9f88e63ccf Changelog addition for #8231 2017-03-30 12:44:59 -07:00
Jonathan A. Sternberg 2ea805c928 Interpolate between different intervals to find the whole area under the curve 2017-03-30 12:51:52 -05:00
Edd Robinson 7644ab1fc4 Fix race in test helpers. Fixes #8177 2017-03-29 12:31:04 +01:00
Edd Robinson 45f843fc91 Don't unassign shards when system shutting down 2017-03-29 11:57:38 +01:00
Jonathan A. Sternberg 7e0ed1f5e5 Ensure the input for certain functions in the query engine are ordered
The following functions require ordered input but were not guaranteed to
receive ordered input:

* `distinct()`
* `sample()`
* `holt_winters()`
* `holt_winters_with_fit()`
* `derivative()`
* `non_negative_derivative()`
* `difference()`
* `moving_average()`
* `elapsed()`
* `cumulative_sum()`
* `top()`
* `bottom()`

These function calls have now been modified to request that their input
be ordered by the query engine. This will prevent the improper output
that could have been caused by multiple series being merged together or
multiple shards being merged together potentially incorrectly when no
time grouping was specified.

Two additional functions were already correct to begin with (so there
are no bugs with these two, but I'm including their names for
completeness).

* `median()`
* `percentile()`
2017-03-28 13:55:37 -05:00
Jonathan A. Sternberg 24109468c3 Merge pull request #8168 from influxdata/js-8167-math-with-multiple-selectors
Fix a regression when math was used with selectors
2017-03-28 13:31:57 -05:00
Cory LaNou 9f674ccec4 suppress headers in output for influx when they are the same 2017-03-28 12:50:23 -05:00
Jonathan A. Sternberg 3e52ec7ca2 Merge pull request #7762 from influxdata/js-6541-timezone-support
Support timezone offsets for queries
2017-03-28 10:39:07 -05:00
Cory LaNou b4d61d4ef4 Merge pull request #8119 from influxdata/cjl-influx-chunked
add chunked/chunk size as setting/options in cli
2017-03-28 10:38:56 -05:00
Jonathan A. Sternberg b14c292cba Fix a regression when math was used with selectors
If there were multiple selectors and math, the query engine would
mistakenly think it was the only selector in the query and would not
match their timestamps.

Fixed the query engine to pass whether the selector should be treated as
a selector so queries like `max(value) * 1, min(value) * 1` will match
the timestamps of the result.
2017-03-27 14:12:15 -05:00
Jonathan A. Sternberg ccf0cb8371 Fix query parser when using addition and subtraction without spaces
Additionally, support unary addition and subtraction for variables,
calls, and parenthesis expressions. Doing `-value` will be the
equivalent of doing `-1 * value` now.
2017-03-24 12:52:19 -05:00
Jason Wilder 2972a3f223 Remove MMAP dereferencing code
This code was added to address some slow startup issues.  It is believed
to be the cause of some segfault panics that occur at query time when
the underlying MMAP array has been unmapped.  The current structure of
code makes this change unnecessary now.
2017-03-23 12:46:23 -06:00
Jonathan A. Sternberg 347b01814e Support timezone offsets for queries
The timezone for a query can now be added to the end with something like
`TZ("America/Los_Angeles")` and it will localize the results of the
query to be in that timezone. The offset will automatically be set to
the offset for that timezone and offsets will automatically adjust for
daylight savings time so grouping by a day will result in a 25 hour day
once a year and a 23 hour day another day of the year.

The automatic adjustment of intervals for timezone offsets changing will
only happen if the group by period is greater than the timezone offset
would be. That means grouping by an hour or less will not be affected by
daylight savings time, but a 2 hour or 1 day interval will be.

The default timezone is UTC and existing queries are unaffected by this
change.

When times are returned as strings (when `epoch=1` is not used), the
results will be returned using the requested timezone format in RFC3339
format.
2017-03-22 15:09:41 -05:00
Jonathan A. Sternberg 33981277bc Fix the time range when an exact timestamp is selected
There is a lot of confusion in the code if the range is [start, end) or
[start, end]. This is not made easier because it acts one way in some
areas and in another way in some other areas, but it is usually [start,
end]. The `time = ?` syntax assumed that it was [start, end) and added
an extra nanosecond to the end time to accommodate that, but the
range was actually [start, end] and that caused it to include one extra
nanosecond when it shouldn't have.

This change fixes it so exactly one timestamp is selected when `time = ?`
is used.
2017-03-21 14:55:31 -05:00
Jonathan A. Sternberg a6c09e58a0 Return an error when an invalid duration literal is parsed 2017-03-21 12:10:41 -05:00
Edd Robinson f89de550ed Significantly speed up DROP DATABASE 2017-03-21 11:35:31 +00:00
Edd Robinson 255992f5ec Reduce cost of admin user check
This commit adds a caching mechanism to the Data object, such that
when large numbers of users exist in the system, the cost of determining
if there is at least one admin user will be low.

To ensure that previously marshalled Data objects contain the correct
cached admin user value, we exhaustively determine if there is an admin
user present whenever we unmarshal a Data object.
2017-03-20 12:04:03 +00:00
Cory LaNou e07d84525d add chunked/chunksize as setting/options 2017-03-17 18:25:52 -05:00
Jason Wilder 00306336ee Update changelog 2017-03-17 16:13:36 -06:00
Jonathan A. Sternberg 5fba1bdcd3 Update liner dependency to handle docker exec
The liner dependency now handles the scenario where the terminal width
is reported as zero. Previously, liner would panic when it tried to
divide by the width (which was zero). Now it falls back onto a dumb
prompt rather than attempting to use a smart prompt and panicking.
2017-03-17 08:42:53 -05:00
Jonathan A. Sternberg 41c8370bbc Fix fill(linear) when multiple series exist and there are null values
When there were multiple series and anything other than the last series
had any null values, the series would start using the first point from
the next series to interpolate points.

Interpolation should not cross between series. Now, the linear fill
checks to make sure the next point is within the same series before
using it to perform interpolation.
2017-03-16 15:54:20 -05:00
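The series check described in 41c8370bbc can be sketched like this. The `point` type and `fillLinear` function are hypothetical and operate on a flat, series-ordered slice with strictly increasing times inside each series; the real iterator-based implementation is not shown in the commit message.

```go
package main

import "fmt"

// point is a hypothetical sample; null marks a missing value to be filled.
type point struct {
	series string
	time   int64
	value  float64
	null   bool
}

// fillLinear fills null points in place by interpolating between the
// previous point and the next non-null point of the same series. A null
// point with no later value in its own series is left null.
func fillLinear(pts []point) {
	for i := 1; i < len(pts); i++ {
		prev := pts[i-1]
		if !pts[i].null || prev.null || prev.series != pts[i].series {
			continue
		}
		// Search forward, but never past the end of the current series.
		for j := i + 1; j < len(pts) && pts[j].series == pts[i].series; j++ {
			if pts[j].null {
				continue
			}
			next := pts[j]
			frac := float64(pts[i].time-prev.time) / float64(next.time-prev.time)
			pts[i].value = prev.value + frac*(next.value-prev.value)
			pts[i].null = false
			break
		}
	}
}

func main() {
	pts := []point{
		{"a", 0, 0, false}, {"a", 10, 0, true}, {"a", 20, 20, false},
		{"b", 0, 100, false},
	}
	fillLinear(pts)
	fmt.Println(pts[1].value)
}
```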
Jonathan A. Sternberg 5072db40c2 Forbid wildcards in binary expressions
When rewriting fields, wildcards within binary expressions were skipped.
This now throws an error whenever it finds a wildcard within a binary
expression in order to prevent the panic that occurs.
2017-03-16 14:26:10 -05:00
Jonathan A. Sternberg 9cdfdd04e9 Do not increment the continuous query statistic if no query is run
Instead of incrementing the `queryOk` statistic with or without the
continuous query running, it will only increment when the query is
actually executed.
2017-03-16 10:36:00 -05:00
Jason Wilder e9eb925170 Coalesce multiple WAL fsyncs
Fsyncs to the WAL can cause higher IO with lots of small writes or
slower disks.  This reworks the previous wal fsyncing to remove the
extra goroutine and remove the hard-coded 100ms delay.  Writes to
the wal still maintain the invariant that they do not return to the
caller until the write is fsync'd.

This also adds a new config option, wal-fsync-delay (default 0s),
which can be increased if a delay is desired.  This is somewhat useful
for systems with slower disks, but the current default works well as
is.
2017-03-15 16:31:03 -06:00
Jonathan A. Sternberg 208d8507f1 Implement both single and multiline comments in influxql
A single line comment will read until the end of a line and is started
with `--` (just like SQL). A multiline comment is delimited by `/* */`. You
cannot nest multiline comments.
2017-03-15 14:24:09 -05:00
Jason Wilder 713a1d2fab Merge pull request #8137 from influxdata/jw-merge-12
Merge 1.2.2 to master
2017-03-14 17:51:34 -06:00
Jason Wilder a16d86ebaa Merge pull request #8136 from influxdata/sort-changelog
Place CHANGELOG.md in descending sort order.
2017-03-14 15:27:46 -06:00
Jason Wilder e62c72d1f9 Merge branch '1.2' into jw-merge-12 2017-03-14 15:15:50 -06:00
Ryan Betts 46fa0c33a0 Place CHANGELOG.md in descending sort order. 2017-03-14 16:38:06 -04:00
Jason Wilder c9740f753b Disable max-row-limit by default
max-row-limit was set at 10000 since 1.0, but due to a bug it was
effectively 0 (disabled).  1.2 fixed this bug via #7368, but this
caused a breaking change w/ Grafana and any users upgrading from <1.2
who had not disabled the config manually.
2017-03-14 12:47:32 -06:00
Mark Rushakoff 32a961005d Update CHANGELOG 2017-03-14 11:34:49 -07:00
Edd Robinson bdc10a0e51 Failed imports will now alter exit code 2017-03-14 15:57:21 +00:00
Jason Wilder b9e5375043 Merge branch '1.2' into jw-merge-12 2017-03-08 13:16:50 -07:00
Jason Wilder 3ec60fe264 Update v1.2.1 release date 2017-03-08 12:26:07 -07:00
Jonathan A. Sternberg 83cf8893e1 Include IsRawQuery in the rewritten statement for meta queries 2017-03-06 14:46:33 -06:00
Jason Wilder 675d7c9d65 Merge branch '1.2' into jw-merge12 2017-03-06 11:09:05 -07:00
Jason Wilder eab012ef61 Fix points missing after compaction
If blocks containing overlapping ranges of time were partially
recombined, it was possible for some points to get dropped
during compactions.  This occurred because the window of time of
the points we need to merge did not account for the partial blocks
created from a prior merge.

Fixes #8084
2017-03-06 10:17:11 -07:00
Jason Wilder 29f8d8de76 Fix race in WALEntry.Encode and Value.Deduplicate
Under high query load, a race exists in the cache and the WAL.  Since
writes currently hit the cache first, they are available for query before
they hit the WAL.  If the WAL is writing and accessing the Value slice
at the same time that a query is run that needs to dedup the same slice,
a race occurs.

To fix this, the cache now just copies the values instead of storing the
slice passed in.  Another way to fix this might be to have the writes go
to the wal before the cache.  I think the latter would be better, but it
introduces some larger write path issues that we'd need to also address.
e.g. if the cache was full, writes to the WAL would need to be rejected
to avoid filling the disk.

Copying the slice in the cache is simpler for now and does not appear to
dramatically affect performance.
2017-03-06 09:38:22 -07:00
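The copy-on-write fix described in 29f8d8de76 can be illustrated with a small sketch. The `cache` type and `write` method are hypothetical; the point is that storing a private copy means a later reader or deduplication pass never touches the slice that another component still holds.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical in-memory store keyed by series.
type cache struct {
	mu     sync.RWMutex
	values map[string][]float64
}

// write stores a copy of vals. Storing vals directly would share its
// backing array with the caller, so concurrent use of the same slice
// elsewhere (for example while encoding it) could race with readers.
func (c *cache) write(key string, vals []float64) {
	cp := make([]float64, len(vals))
	copy(cp, vals)
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[key] = cp
}

func main() {
	c := &cache{values: map[string][]float64{}}
	vals := []float64{1, 2, 3}
	c.write("cpu", vals)
	vals[0] = 99 // the caller keeps using its own slice
	fmt.Println(c.values["cpu"][0])
}
```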
Ben Johnson 4c202eea09 Re-check field type under write lock. 2017-03-03 09:47:43 -07:00
Ben Johnson dffd12319c Add point.UnmarshalBinary() bounds checking. 2017-03-01 12:01:25 -07:00
Jonathan A. Sternberg c5970b59b4 Map types correctly when selecting a field with multiple measurements where one of the measurements is empty 2017-03-01 11:47:26 -06:00
Jonathan A. Sternberg b942f3a373 Merge pull request #8069 from influxdata/js-8044-measurement-with-underscore-prefix
Treat non-reserved measurement names with underscores as normal measurements
2017-02-28 10:21:26 -06:00
Jason Wilder 414fe1349e Merge 1.1.4 changelog 2017-02-27 16:51:07 -07:00
Jonathan A. Sternberg 1081785cb4 Treat non-reserved measurement names with underscores as normal measurements
A measurement name that begins with an underscore and does not conflict
with one of the reserved measurement names will now be passed untouched
to the underlying shards rather than being intercepted as an empty
measurement.

A user still shouldn't rely on measurements that begin with underscores
to always be accessible, but this will prevent the most common use case
from causing unexpected behavior since we will very rarely, if ever, add
additional system sources.
2017-02-27 16:49:02 -06:00
Jonathan A. Sternberg 1fb34e3eef Dividing aggregate functions with different outputs doesn't panic 2017-02-23 18:38:29 -06:00
David Norton c7fa58473f fix #8028: call api.NewTypesDB() instead of new
The code was calling new(api.TypesDB) which didn't initialize an
unexported map inside of the type. Call api.NewTypesDB() instead.
2017-02-23 18:04:21 -05:00
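The difference between `new(T)` and a constructor, as described in the commit above, can be illustrated with a small sketch. The `typesDB` type and its `rows` field are hypothetical stand-ins for `api.TypesDB`, and it is an assumption here that the failing operation was a write to the uninitialized map.

```go
package main

import "fmt"

// typesDB is a hypothetical type with an unexported map, standing in
// for api.TypesDB.
type typesDB struct {
	rows map[string][]string
}

// newTypesDB is a hypothetical constructor that initializes the map.
func newTypesDB() *typesDB {
	return &typesDB{rows: make(map[string][]string)}
}

// add records data source names for a type. Writing to a nil map panics.
func (db *typesDB) add(name string, sources ...string) {
	db.rows[name] = sources
}

// addPanics calls add and reports whether it panicked.
func addPanics(db *typesDB) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	db.add("cpu", "value")
	return false
}

func main() {
	fmt.Println(addPanics(new(typesDB)))  // map left nil by new()
	fmt.Println(addPanics(newTypesDB())) // map initialized by the constructor
}
```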
Jonathan A. Sternberg 72e4dd01b9 Properly select a tag within a subquery
Previously, subqueries would only be able to select tag fields within a
subquery if the subquery used a selector. But, it didn't filter out
aggregates like `mean()` so it would panic instead.

Now it is possible to select the tag directly instead of rewriting the
query in an invalid way.

Some queries in this form will still not work though. For example, the
following still does not function (but also doesn't panic):

    SELECT count(host) FROM (SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host))
2017-02-23 11:16:22 -06:00
Jonathan A. Sternberg 5a2b458180 Reduce the expression in a subquery to avoid a panic
The builder used for subqueries does not handle parentheses, so a set
of parentheses wrapping a field would cause it to panic. This code now
reduces the expression so the parentheses are removed before being
processed.
2017-02-23 10:14:05 -06:00
Jason Wilder a024003f2c Merge branch '1.2' into jw-merge-12 2017-02-22 12:13:29 -07:00
Ben Johnson 78a9bb2527 Remove Tags.shouldCopy, replace with forceCopy on series creation.
Previously, tags had a `shouldCopy` flag to indicate if those tags
referenced an underlying buffer and should be copied to allow GC.
Unfortunately, it also prevented copying tags that had been created
and still referenced the mmap, which caused segfaults.

This change removes the `shouldCopy` flag and replaces it with a
`forceCopy` argument in `CreateSeriesIfNotExists()`. This allows
the write path to indicate that tags must be cloned on insert.
2017-02-21 11:13:35 -07:00
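A rough sketch of the `forceCopy` idea from the commit above, under stated assumptions: the `Tag`, `index`, and `createSeriesIfNotExists` definitions below are simplified, hypothetical versions and do not reflect the real signatures. It shows why cloning on insert matters when tag bytes alias a buffer that may later be reused or unmapped.

```go
package main

import "fmt"

// Tag is a hypothetical key/value pair whose byte slices may alias a
// larger buffer (a parse buffer or an mmap'd file).
type Tag struct {
	Key, Value []byte
}

// clone returns a tag backed by freshly allocated memory.
func (t Tag) clone() Tag {
	return Tag{
		Key:   append([]byte(nil), t.Key...),
		Value: append([]byte(nil), t.Value...),
	}
}

// index is a hypothetical series index keyed by series key.
type index map[string][]Tag

// createSeriesIfNotExists stores tags for key. When forceCopy is true
// the tags are cloned first, so the index never references the
// caller's buffer.
func (idx index) createSeriesIfNotExists(key string, tags []Tag, forceCopy bool) {
	if _, ok := idx[key]; ok {
		return
	}
	if forceCopy {
		cloned := make([]Tag, len(tags))
		for i, t := range tags {
			cloned[i] = t.clone()
		}
		tags = cloned
	}
	idx[key] = tags
}

func main() {
	buf := []byte("host=a")
	tags := []Tag{{Key: buf[:4], Value: buf[5:]}}

	idx := index{}
	idx.createSeriesIfNotExists("cpu,host=a", tags, true)

	// Simulate the underlying buffer being reused.
	copy(buf, "XXXXXX")
	stored := idx["cpu,host=a"][0]
	fmt.Printf("%s=%s\n", stored.Key, stored.Value)
}
```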
Mark Rushakoff 601cbcd084 Merge branch '1.2' into mr-merge-12 2017-02-17 16:14:22 -08:00
Ben Johnson 8e79ca5d75 Fix tag dereferencing panic.
Clones series tags under lock during var ref iterator creation.
2017-02-15 17:56:47 -07:00
Jonathan A. Sternberg 71f62d33e6 Map types correctly when using a regex and one of the measurements is empty 2017-02-13 18:14:29 -06:00
Mark Rushakoff c762ab49ee Merge pull request #7974 from influxdata/mr-4785-show-databases
Allow non-admin users to execute SHOW DATABASES
2017-02-13 15:04:00 -08:00
Jason Wilder c3de210ded Merge branch '1.2' into jw-merge-12 2017-02-13 11:45:27 -07:00
Mark Rushakoff 53699aa24f Allow non-admin users to execute SHOW DATABASES
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.

When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.

When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.

This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.

Fixes #4785.
2017-02-13 08:59:16 -08:00
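The filtering behaviour described in the commit above might look something like the sketch below. The single-method `Authorizer` interface, the `userInfo` and `openAuthorizer` types, and `showDatabases` are simplified assumptions made for illustration; the real `influxql.Authorizer` and `meta.UserInfo` definitions may differ.

```go
package main

import "fmt"

// Authorizer is a hypothetical, simplified form of the interface
// described above: it decides whether a database is visible.
type Authorizer interface {
	AuthorizeDatabase(name string) bool
}

// userInfo is a hypothetical stand-in for meta.UserInfo.
type userInfo struct {
	admin      bool
	privileges map[string]bool // databases with read and/or write access
}

func (u *userInfo) AuthorizeDatabase(name string) bool {
	return u.admin || u.privileges[name]
}

// openAuthorizer permits everything, modelling disabled authentication.
type openAuthorizer struct{}

func (openAuthorizer) AuthorizeDatabase(string) bool { return true }

// showDatabases returns only the databases the authorizer permits.
func showDatabases(all []string, auth Authorizer) []string {
	var visible []string
	for _, db := range all {
		if auth.AuthorizeDatabase(db) {
			visible = append(visible, db)
		}
	}
	return visible
}

func main() {
	all := []string{"telegraf", "internal", "metrics"}
	reader := &userInfo{privileges: map[string]bool{"metrics": true}}

	fmt.Println(showDatabases(all, &userInfo{admin: true}))
	fmt.Println(showDatabases(all, reader))
	fmt.Println(showDatabases(all, openAuthorizer{}))
}
```

Passing the authorizer in through the execution context keeps the statement executor unaware of how permissions are stored.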
jgeiger 43117a94d6 Add chunked processing back into v2 client
- Moving to the v2 client removed this functionality. This copies
  code back into the client. The tests were also added back into
  the test suite.
2017-02-13 09:21:13 -07:00
Jonathan A. Sternberg a0d8c1ca9f Add modulo operator to the query language 2017-02-10 10:16:37 -06:00
Jason Wilder 0d9fd8a37b Merge pull request #7948 from CAFxX/gzip_encoder_pool
[influxd] Use sync.Pool to reuse gzip.Writers across requests
2017-02-09 09:04:24 -07:00
Jonathan A. Sternberg 2ad1668c2a Prevent a panic when aggregates are used in an inner query with a raw query
The following types of queries will panic:

    SELECT mean, host FROM (SELECT mean(value) FROM cpu GROUP BY host)
    SELECT top(sum, host, 3) FROM (SELECT sum(value) FROM cpu GROUP BY host)

These queries _should_ work, but due to a current limitation with
aggregate functions, the aggregate functions won't return any auxiliary
fields. So even if a tag is not an auxiliary field, it is treated that
way by the query engine and this query will fail.

Fixing this properly will take a longer period of time. This fix just
prevents the panic from killing the server while we fix this for real.
2017-02-08 11:44:56 -06:00
Jason Wilder 1bc0f68490 Merge branch '1.2' into jw-merge-12 2017-02-07 12:48:36 -07:00
Jonathan A. Sternberg e1fa48d0dd Fix ORDER BY time DESC with ordering series keys
The order of series keys is in ascending alphabetical order, not
descending alphabetical order, when it is ordered by descending time.
This fixes the ordering so points are returned in descending order. The
emitter also had the conditions for choosing which iterator to use in
the wrong direction (which only affects aggregates with `FILL(none)`).
2017-02-06 15:49:12 -06:00
Carlo Alberto Ferraris a6a7782e04 [influxd] Use a sync.Pool to reuse gzip.Writer across requests
This brings alloc_space down from ~20200M to ~10700M in a run of
go test ./cmd/influxd/run -bench=Server -memprofile=mem.out -run='^$'
2017-02-07 05:23:58 +09:00
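The pooling technique named in the commit above can be sketched with the standard library alone. The `gzipWriterPool`, `compress`, and `roundTrip` names are hypothetical and this is not the actual handler code; it only shows the general `sync.Pool` plus `gzip.Writer.Reset` pattern (`io.Discard` and `io.ReadAll` assume Go 1.16 or later).

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

// gzipWriterPool hands out reusable gzip.Writers (hypothetical name).
var gzipWriterPool = sync.Pool{
	New: func() interface{} { return gzip.NewWriter(io.Discard) },
}

// compress writes payload to dst in gzip format using a pooled writer.
func compress(dst io.Writer, payload []byte) error {
	gz := gzipWriterPool.Get().(*gzip.Writer)
	defer gzipWriterPool.Put(gz)

	gz.Reset(dst) // point the recycled writer at the new destination
	if _, err := gz.Write(payload); err != nil {
		return err
	}
	return gz.Close() // flushes and writes the gzip footer
}

// roundTrip compresses and then decompresses payload.
func roundTrip(payload []byte) (string, error) {
	var buf bytes.Buffer
	if err := compress(&buf, payload); err != nil {
		return "", err
	}
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	for _, msg := range []string{"first response", "second response"} {
		got, err := roundTrip([]byte(msg))
		fmt.Println(got, err)
	}
}
```

Each `gzip.Writer` carries sizeable internal buffers, so reusing writers across requests avoids reallocating them every time.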
Jonathan A. Sternberg 95831b3307 Fix LIMIT and OFFSET when they are used in a subquery
This fixes LIMIT and OFFSET when they are used in a subquery where the
grouping of the inner query is different than the grouping of the outer
query. When organizing tag sets, the grouping of the outer query is
used so the final result is in the correct order. But, unfortunately,
the optimization incorrectly limited the number of points based on the
grouping in the outer query rather than the grouping in the inner query.

The ideal solution would be to use the outer grouping to further
organize it by the grouping for the inner subquery, but that's more
difficult to do at the moment. As an easier fix, the query engine now
limits the output of each series. This may result in these types of
queries being slower in some situations like this one:

    SELECT mean(value) FROM (SELECT value FROM cpu GROUP BY host LIMIT 1)

This will be slower in a situation where the `cpu` measurement has a
high cardinality and many different tags.

This also fixes `last()` and `first()` when they are used in a subquery
because those functions use `LIMIT 1` as an internal optimization.
2017-02-06 14:04:34 -06:00
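To illustrate the per-series limiting described in the commit above, here is a minimal sketch using hypothetical `point`, `limitPerSeries`, and `mean` helpers. It is not the query engine's iterator code; it only shows that the limit is applied to each inner series before the outer aggregate runs.

```go
package main

import "fmt"

// point is a hypothetical, simplified point tagged with its series key.
type point struct {
	series string
	value  float64
}

// limitPerSeries keeps at most limit points from each series,
// preserving input order.
func limitPerSeries(points []point, limit int) []point {
	emitted := make(map[string]int)
	var out []point
	for _, p := range points {
		if emitted[p.series] >= limit {
			continue
		}
		emitted[p.series]++
		out = append(out, p)
	}
	return out
}

// mean averages the values of the given points (0 for no points).
func mean(points []point) float64 {
	if len(points) == 0 {
		return 0
	}
	var sum float64
	for _, p := range points {
		sum += p.value
	}
	return sum / float64(len(points))
}

func main() {
	points := []point{
		{"cpu,host=a", 10}, {"cpu,host=a", 20},
		{"cpu,host=b", 30}, {"cpu,host=b", 40},
	}
	// Inner query: GROUP BY host LIMIT 1; outer query: mean(value).
	inner := limitPerSeries(points, 1)
	fmt.Println(len(inner), mean(inner))
}
```

Tracking a counter per series is what makes this slower for high-cardinality measurements, as the commit message notes.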
Jonathan A. Sternberg caaad60dcf Fix authentication when subqueries are present
The code that checked if a query was authorized did not account for
sources that were subqueries. Now, the check for the required privileges
will descend into the subquery and add the subquery's required
privileges to the list of required privileges for the entire query.
2017-02-06 09:43:14 -06:00
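The recursive privilege check described in the commit above could be sketched like this. The `statement` and `source` types and the `requiredDatabases` function are hypothetical simplifications; the real code works on the InfluxQL AST and tracks privilege levels, not just database names.

```go
package main

import "fmt"

// statement is a hypothetical, simplified SELECT statement.
type statement struct {
	sources []source
}

// source is either a concrete database source or a subquery.
type source struct {
	database string     // set for a measurement source
	subquery *statement // set when the source is a subquery
}

// requiredDatabases lists every database the statement reads from,
// descending into subqueries so nested sources are not skipped.
func requiredDatabases(stmt *statement) []string {
	var dbs []string
	for _, src := range stmt.sources {
		if src.subquery != nil {
			dbs = append(dbs, requiredDatabases(src.subquery)...)
			continue
		}
		dbs = append(dbs, src.database)
	}
	return dbs
}

func main() {
	inner := &statement{sources: []source{{database: "telegraf"}}}
	outer := &statement{sources: []source{{subquery: inner}, {database: "metrics"}}}
	fmt.Println(requiredDatabases(outer))
}
```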
Jason Wilder 2e95b4043c Merge branch '1.2' into jw-merge-12 2017-02-02 16:40:36 -07:00
Jonathan A. Sternberg e49ba016fa Fix incorrect math when aggregates that emit different times are used
When using `non_negative_derivative()` and `last()` together in a math
expression, the points would not be matched with each other because one
of those aggregates would emit one fewer point than the other. The
math iterators have been modified so they now track the name and tags of
a point and match based on those.

This isn't necessarily ideal and may come to bite us in the future. We
don't necessarily have a defined structure for all iterators so it can
be difficult to know which of two points is supposed to come first in
the ordering. This uses the common ordering that usually makes sense,
but the query engine is getting complicated enough that I am not 100%
certain that this is correct in all circumstances.
2017-02-02 14:40:41 -06:00
Joe LeGasse 9abd5ba46f updated CHANGELOG 2017-02-02 10:49:30 -05:00