They rebased a revision we were previously relying upon that allowed us
to use the vanity name so we are reverting back to an older version with
the old import path.
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.
When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.
When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.
This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.
Fixes#4785.
This change adds some very basic name validation with the following
plain-english description: names must be non-zero sequence of printable
characters that do not contain slashes ('/' or '\') and are not equal to
either "." or "..".
The intent is that, since we currently just use database and retention
policy names directly as path elements, these rules will hopefully leave
us with names that should be at least close to valid directory names.
Ideally, we would restrict names even further or not use them as path
elements directly, but this should be a step towards the former without
restricting names "too much"
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.
Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).
The syntax is like this:
SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)
This will execute derivative and then sum the result of those derivatives.
Another example:
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)
This would let you find the maximum minimum value of each host.
There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:
SELECT mean(value) FROM cpu
SELECT mean(value) FROM (SELECT value FROM cpu)
Have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it comes to
clustered setups. The second will process the mean at the top level and will not
include that optimization.
There was no comment on either case specifying that the `return nil`
was deliberate instead of `return err`, so I'm assuming these were
typos. I added tests to conserve the error-returning behavior.
It looks like the real import path to the project is go.uber.org/zap
instead of github.com/uber-go/zap since the example in the project
references that path.
The logging library has been switched to use uber-go/zap. While the
logging has been changed to use structured logging, this commit does not
change any of the logging statements to take advantage of the new
structured log or new log levels. Those changes will come in future
commits.
The url must have a scheme of udp,http,https and a port number.
CREATE SUBSCRIPTION will fail if there are invalid destinations.
Additionally Service.createSubscription fail invalid destinations are detected.
Fixes#7615
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.
This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.
Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
When the `max-row-limit` was hit, the goroutine reading from the results
channel would stop reading from the channel, but it didn't signal to the
sender that it was no longer reading from the results. This caused the
sender to continue trying to send results even though nobody would ever
read it and this created a deadlock.
Include an `AbortCh` on the `ExecutionContext` that will signal when
results are no longer desired so the sender can abort instead of
deadlocking.
There are 2 new keys in the configuration file.
- security-level: "none", "sign", or "encrypt".
- auth-file: The location of the user/password file.
Please see the collectd network doc for more details.
The admin console would dynamically discover the version from the
InfluxDB server, but for patch releases, it included the patch in the
link to the documentation and that wasn't a valid link.
Truncate the version so the documentation url is correct since we only
do documentation for `major.minor`.
This changes the behavior of the max-series-per-database and
max-values-per-tag limits to drop points that would exceed the limits
and allow the remaining points to be written. Previously, the whole
batch would fail and return and 500 error to the client.
This now will write the allow points and return a `partial write`
error indicating some of the points were dropped, how many were
dropped and one of the problem measureent and tags.
The subscriber write goroutine would drop points if the write load
was higher than it could process. This could happen with a just
a few writers to the server.
Instead, process the channel with multiple writers to avoid dropping
writes so easily. This also adds some config options to control how
large the channel buffer is as well as how many goroutines are started.
Fixes#7330
The tags passed into Statistics() calls are not supposed to be modified.
The balanceWriter in subscribers tried to modify them triggering a panic
because they can be nil.
The vet checks for some files did not pass for go 1.7. As part of a
preliminary start to making go 1.7 work with this software, go vet
should pass.
Also updated the gogo/protobuf dependency which fixed the code generator
to work with go 1.7 too. Ran `go generate` on the entire repository to
ensure every file was up to date.
When we refactored expvar, the cmdline and memstats sections were not
readded to the output. This adds it back if they can be found inside of
`expvar`.
It also stops trying to sort the output of the statistics so they get
returned faster. JSON doesn't need them to be sorted and it causes
enough latency problems that sorting them hurts performance.
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for that are actually used. This way we
can also identify which things the user provided and which ones we are
filling with default values.
This allows the meta client to be able to make smarter decisions when
determining if the user requested a conflict or if the requested
capabilities match with what is currently available. If you just say
`CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the
duration of the retention policy is and just wants to use the default.
Now, we can use that information to determine if an existing retention
policy would conflict with what the user requested rather than returning
an error if a default value ever gets changed since the meta client
command can communicate intent more easily.
Previously, we implicitly added a newline and had to add one to the
number of bytes transmitted because we added that byte. That was removed
at some point and the metric was not updated to record the correct
value.
The query killing functionality depends on the ResponseWriter exposing a
CloseNotify method. Since we wrap the http.ResponseWriter, the new
struct does not have that method and the HTTP handler would skip past
calling that method.
Instead of duplicating `Flush()` and `CloseNotify()` for every response
formatter, we will unify all of that under a single struct and create
formatters instead.
Also, fixes a bug where the header information from a query would not be
returned until some other data was returned with it because of
buffering and another bug in the gzipResponseWriter that wouldn't flush
the actual underlying ResponseWriter.
The query can be uploaded from a file using `multipart/form-data` and
setting the file name to `q`. An example of using curl to execute an
async query would be:
curl -F "q=@database.iql" -F "async=true" http://localhost:8086/query
It will return a 204 No Content as long as the query is accepted
(immediate errors will be returned, but not individual errors with
specific queries). The only way to kill the query is by using the task
manager.
CSV doesn't offer a way to separate different sheets from each other and
it doesn't really have a standard format. We separate sheets with a
newline so they can be imported into something like Excel or LibreOffice
more easily.
The number of columns for each sheet is inferred from the first returned
row in each statement since they should all be the same.
Instead of having the parser set the defaults, the command will set the
defaults so that the constants for that are actually used. This way we
can also identify which things the user provided and which ones we are
filling with default values.
This allows the meta client to be able to make smarter decisions when
determining if the user requested a conflict or if the requested
capabilities match with what is currently available. If you just say
`CREATE DATABASE WITH NAME myrp`, the user doesn't really care what the
duration of the retention policy is and just wants to use the default.
Now, we can use that information to determine if an existing retention
policy would conflict with what the user requested rather than returning
an error if a default value ever gets changed since the meta client
command can communicate intent more easily.
This commit limits queries to only process one shard at a time.
However, within a shard, multiple series can still be processed in
parallel. Shard iterators are lazily instantiated during query
execution to limit the amount of memory a given query uses.
The `Content-Type` header is meant to say what the content type of the
request is supposed to be and `Accept` is used to ask for a specific
content type. I messed this up and used `Content-Type` instead of
`Accept`. This works with `GET` requests accidentally, but `POST`
requests stop working.
According to the HTTP standard, a lack of authentication credentials or
incorrect authentication credentials should send back a 401
(Unauthorized) with a `WWW-Authenticate` header with a challenge that
can be used to authenticate. This is because a 401 status should be sent
when an authentication attempt can be retried by the browser.
The 403 (Forbidden) status code should be sent when authentication
succeeded, but the user does not have the necessary authorization.
Previously, the server would always send a 401 status code.
Truncate the time interval output of the monitor service to be on even
time intervals rather than on every minute based on the start time. This
normalizes the output from the monitor service.
Similar to #1339 .
Fix getClientVersion() on setups where the admin interface resides on some
other path than root path due to some reverse proxy setup.
Normalize the output for the various help options so they all follow the
same format and display all relevant options.
Removing some of the unused config options from the configuration file
and updating the help documentation. Removing some remaining references
to clustering within the open source version.
The tsdb package had a substantial amount of dead code related to the
old query engine still in there. It is no longer used, so it was removed
since it was left unmaintained. There is likely still more code that is
the same, but wasn't found as part of this code cleanup.
influxql has dead code show up because of the code generation so it is
not included in this pruning.
The highest time represented by a nanosecond needs to be used for an
exclusive range, so the maximum time needs to be one less than the
possible maximum number of nanoseconds representable by an int64 so that
we don't lose a point at that one time.
Previously worked in the open source version because the timestamp used
for finding a shard would be truncated by the retention policy so the
lookup time didn't run into this edge case because it didn't rest on the
truncation boundary. Since that point didn't really belong in that shard
group and was placed there by mistake, it's best to fix this bug since
the timestamp used to create the shard group should be capable of
retrieving it.
changes the httpd log lines from this:
[httpd] 2016/06/08 14:06:39 ::1 - - [08/Jun/2016:14:06:39 +0100] POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1 204 0 - InfluxDBClient d6aa01fc-2d79-11e6-8024-000000000000 2.751391ms
to this:
[httpd] ::1 - - [08/Jun/2016:14:06:39 +0100] "POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" d6aa01fc-2d79-11e6-8024-000000000000 2751
So it changes a few things:
1. Remove the logger timestamp at the beginning which isn't very relevant anyways
2. adds quotes around "METHOD URI PROTOCOL", because this is part of the
common log format.
3. adds quotes around "AGENT" and "REFERRER" because this is part of the
"combined" log format.
4. Puts the response time in integer microseconds, because this is
consistent with apache's %D config mod option.
Compared with CLF, our logs now look like this:
[httpd] %{COMMON_LOG_FORMAT} "<agent>" "<referrer>" <request_uuid> <response_time_µs>
For reference, see:
https://en.wikipedia.org/wiki/Common_Log_Formathttp://httpd.apache.org/docs/current/mod/mod_log_config.html
The task manager now acts as its own statement executor so that a custom
statement executor can perform custom actions for KillQueryStatement and
ShowQueriesStatement.
The graphite service will attempt to create the retention policy and use
it. If the retention policy doesn't exist, it will be created with the
default options.
Fixes#5655.
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.
Removing the unused Series structs. Their necessity was removed by a
previous commit, but the structs were not removed yet.
Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
The default retention policy name is changed to "autogen" instead of
"default" since it ends up being ambiguous when we tell a user to check
the default retention policy, it is uncertain if we are referring to the
default retention policy (which can be changed) or the retention policy
with the name "default".
Now the automatically generated retention policy name is "autogen".
The default retention policy is now also configurable through the
configuration file so an administrator can customize what they think
should be the default.
Fixes#3733.
Use the latest documentation by default if the server version can't be
found. If it can be found, update the documentation link to that
specific version so it always points at the exact documentation relevant
for the server version.
Fixes#5906.
The HTTPS configuration for the httpd service only had an option to
specify the certificate file and the same file would be used for both
the certificate and private key file (they could be concatenated
together).
This adds an additional option to specify the files differently from
each other while still allowing the previous behavior. If only
`https-certificate` is specified, the httpd service will try to load the
private key from the `https-certificate` file. If a separate
`https-private-key` file is specified, the private key will be loaded
from there instead.
Fixes#1310.
The parser can be passed a map of keys to literal values to be replaced
into the query. Parameters are preceded by a dollar sign (`$`). If a
parameter key is missing, an error is thrown by the parser.
Fixes#2926.
When authenticating a request, check that an admin user exists instead
of checking for len(users) > 0. This prevents getting stuck with no
admin user and being unable to create one.
The http connection limit is for any HTTP operation and is independent
of the other connection limits. It should be set to a higher value than
the query limit. The difference between this and the query limit is it
will close out the connection immediately without any further
processing.
This is the equivalent of the `max_connections` option in PostgreSQL.
Also removes some unused config options from the cluster config.
Fixes#6559.
In order to follow REST a bit more carefully, all write operations
should go through a POST in the future. We still allow read operations
through either GET or POST (similar to the Graphite /render endpoint),
but write operations will trigger a returned warning as part of the JSON
response and will eventually return an error.
Also updates the Golang client libraries to always use POST instead of
GET.
Fixes#6290.
It can sometimes be difficult to determine if a query submitted by
the UI has been accepted for execution.
In particular, if a query that returns empty results is followed
by a query that takes a long time, then the green success message
from the first query may be misleadingly displayed to the user when,
in fact, the second query is still in progress.
To address this, we always hide the query error and success divs
before the query is submitted. Previously, just the results were
cleared.
Secondly, if the user re-runs a query expecting slightly different results,
it can also be unclear whether the second attempt to execute the command
is simply redisplaying the results of the previous query or the results of the
resubmission, this is particularly true if the queries involved
have short execution times.
To support the ability to distinguish these two cases, we have any attempt
to use the history arrow also clear the results and status divs. This
reduces the ambiguity about whether the next results display is, in fact,
the result of executing a new query.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
Sanitizing is now done through pattern matching rather than parsing the
query and replacing the password in the query. This prevents
accidentally redacting the wrong part of a query and revealing what the
password is through association.
Fixes#3883.
This has various benefits:
- Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily.
- More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.*` fields, etc).
- This is also more efficient, because it means `executeQuery` no longer allocates a single `*log.Logger` each time it is called.
The deprecated message is now attached to a new attribute returned with
the results. This message can then be read by clients to warn a user
about upcoming changes to the query engine.
The `influx` client has already been modified to read this message and
print it out for every format except CSV.
The first warning message is a deprecated message about removing `IF NOT
EXISTS` from `CREATE DATABASE`.
The message will also be printed to the server log.
Fixes#5707.