* Add AuthorizeDatabase API to QueryAuthorizer to verify a user has
appropriate access to the specified database (a sketch follows this list)
* Update serverFluxQuery handler to require a meta.User when auth is
enabled
* Update Flux createFromSource and createBucketsSource dependencies to
require Authorizer when auth is enabled in configuration
* Update createFromSource to verify read permissions for each bucket
specified in a Flux query
* Update BucketsDecoder, which implements the buckets() Flux function,
to return buckets that the user has read or write permissions to
* Add unit tests to verify authentication is required for Flux HTTP
requests when auth is enabled in configuration
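A minimal sketch of what the AuthorizeDatabase check might look like; the
signature, the influxql.Privilege parameter, the error messages, and the
assumption that meta.User exposes a matching AuthorizeDatabase method are
illustrative, not the actual API:

import (
	"errors"
	"fmt"
)

// AuthorizeDatabase returns an error unless the user holds the required
// privilege on the named database.
func (a *QueryAuthorizer) AuthorizeDatabase(u meta.User, priv influxql.Privilege, database string) error {
	if u == nil {
		// Auth is enabled but no user accompanied the request.
		return errors.New("no user provided")
	}
	if !u.AuthorizeDatabase(priv, database) {
		return fmt.Errorf("user is not authorized to access database %q", database)
	}
	return nil
}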
The access log filter allows the access log to be filtered by a status
code pattern. The pattern is a list of strings of the form `NXX`. At
least one number must be specified and up to 2 Xs can be used. That
means the filter can be an exact status code or it can be a range of
them. For example, `500` would only match the 500 HTTP status code,
while `5XX` would match any status code beginning with the number 5
(categorized as server errors). The pattern `50X` would also be
accepted. Both uppercase and lowercase Xs are allowed.
Multiple filters can be specified and the log line will be printed if
one of them matches. If there are no filters specified, all status codes
are printed.
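A minimal sketch of matching a status code against an `NXX`-style pattern;
the helper name is illustrative, not the actual implementation:

import "strconv"

// matchStatus reports whether an HTTP status code matches an NXX-style
// pattern, where each X (upper- or lowercase) matches any digit.
func matchStatus(pattern string, code int) bool {
	status := strconv.Itoa(code)
	if len(pattern) != len(status) {
		return false
	}
	for i := 0; i < len(pattern); i++ {
		if pattern[i] == 'X' || pattern[i] == 'x' {
			continue
		}
		if pattern[i] != status[i] {
			return false
		}
	}
	return true
}

For example, matchStatus("5XX", 503) and matchStatus("500", 500) are true,
while matchStatus("50X", 511) is false.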
This commit deletes most of the code to service reads from influxdb
and pulls it in from platform instead.
Of note, the models.Tag and models.Tags types are now aliases to the
platform models.Tag and models.Tags types. Additionally, many types
in the tsdb package relating to cursors are also aliases to the same
types in the platform cursors package.
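For illustration, the aliasing has roughly this shape (the exact import path
is an assumption):

import platform "github.com/influxdata/platform/models"

// Aliases, not new types: values convert freely between the packages.
type Tag = platform.Tag
type Tags = platform.Tags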
This updates the platform and flux repos to the current master in the
Gopkg.lock.
* The protocol service definition, including ReadRequest and
ReadResponse, is reused across projects, rather than requiring
redefinition
* The ReadRequest protocol buffer definition removes the concept of a
database and retention policy, replacing it with a field named
ReadSource of type google.protobuf.Any. OSS requests will use the
ReadSource message structure defined locally in this package, which
defines fields to represent a Database and RetentionPolicy. Other
implementations can provide their own data structure allowing the
remainder of the ReadRequest to be reused
* The RPC service and Store are expected to be redefined to handle their
specific requirements for resolving a ReadSource
* ResultSet and GroupResultSet are interfaces representing non-grouping
and grouping read behavior respectively. Calling NewResultSet or
NewGroupResultSet will construct instances of these types
* The ResponseWriter type is exported to deal with serialization of
the ResultSet and GroupResultSet types
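A minimal sketch of resolving an OSS ReadSource from the Any field, assuming
gogo protobuf helpers are in use (the function name is illustrative):

import "github.com/gogo/protobuf/types"

// getReadSource unpacks the google.protobuf.Any into the OSS-specific
// ReadSource message carrying Database and RetentionPolicy.
func getReadSource(req *ReadRequest) (*ReadSource, error) {
	var source ReadSource
	if err := types.UnmarshalAny(req.ReadSource, &source); err != nil {
		return nil, err
	}
	return &source, nil
}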
There were two problems with this code:
1. The send on pending did not imply that the handler was running.
2. There was a race between starting the handler and timing out.
Problem 1 is fixed by sending to a begin channel inside the handler. It
is then guaranteed that the timeout handler code has been entered.
Problem 2 is fixed by attempting to acquire the semaphore channel once
before checking the timeout channel. In this way, if there is capacity
(which in this test there is known to be), it is guaranteed to be taken.
If we check the semaphore and the timer at the same time and the timer
has already fired, there is a pseudorandom chance the timer case will be
taken even if there is capacity.
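A minimal sketch of the second fix; sem, timeout, and the function name are
illustrative stand-ins for the test's semaphore channel and timer:

import (
	"errors"
	"time"
)

func acquire(sem chan struct{}, timeout <-chan time.Time) error {
	// Try a non-blocking acquire first: existing capacity is
	// guaranteed to be taken rather than racing against the timer.
	select {
	case sem <- struct{}{}:
		return nil
	default:
	}
	// Otherwise wait for capacity or the timeout, whichever is first.
	select {
	case sem <- struct{}{}:
		return nil
	case <-timeout:
		return errors.New("timed out")
	}
}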
* Update Prometheus remote write to use metric name as measurement name and value as the field name.
* Update Prometheus remote read to use the storage.Read method to bypass the InfluxQL query engine.
This commit adds throttling to the HTTP write endpoints based on
queue depth and, optionally, timeout. Two queues exist: `enqueued`
and `current`. The `current` queue is the number of concurrent
requests that can be processed. The `enqueued` queue limits the
maximum number of requests that can be waiting to be processed.
If the timeout is exceeded or the `enqueued` queue is full, then
a `503 Service Unavailable` code is returned and the error is
logged.
By default these options are turned off.
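A minimal sketch of the mechanism, assuming hypothetical limit and parameter
names (the actual configuration options differ):

import (
	"net/http"
	"time"
)

func throttle(next http.Handler, concurrent, maxEnqueued int, timeout time.Duration) http.Handler {
	current := make(chan struct{}, concurrent)   // in-flight requests
	enqueued := make(chan struct{}, maxEnqueued) // waiting requests
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case enqueued <- struct{}{}: // take a queue slot
			defer func() { <-enqueued }()
		default: // the `enqueued` queue is full
			http.Error(w, "request queue full", http.StatusServiceUnavailable)
			return
		}
		var timer <-chan time.Time
		if timeout > 0 {
			t := time.NewTimer(timeout)
			defer t.Stop()
			timer = t.C // nil channel (blocks forever) when no timeout is set
		}
		select {
		case current <- struct{}{}: // take a processing slot
			defer func() { <-current }()
			next.ServeHTTP(w, r)
		case <-timer: // waited too long for a processing slot
			http.Error(w, "request timed out in queue", http.StatusServiceUnavailable)
		}
	})
}

For simplicity this sketch holds the queue slot for the lifetime of the
request; the real implementation need not.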
Remove the `Query` prefix from some structs and interfaces. They were
there so when the query engine was in the same package as influxql,
these would be differentiated. Now that the package name is query, the
extra prefix seems redundant.
The previous sha was taken from a revision on a devel branch that I
thought would continue staying in the tree after it was merged. That
revision was rebased away and the API was changed for the logger.
This updates the usage of the logger and adds a simple package for
constructing the base logger.
The 1.0 version of zap changed the format of the default console logger
so this change moves over to this new logger instead of attempting to
retain backwards compatibility with the old format.
Adds a new package, prometheus, for converting from Prometheus remote
reads and writes to Influx queries and points. Adds two new endpoints to
the httpd handler to support Prometheus remote read at
/api/v1/prom/read and remote write at /api/v1/prom/write.
The only thing used from Prometheus is the storage/remote files that are
generated from the remote.proto file. That file was copied into the
prometheus/remote package to avoid an extra dependency.
There are several places in the code where comma-ok map retrieval was
being used poorly. Some were benign, like checking existence before
issuing an unconditional delete with no cleanup. Others were potentially
far more serious: assuming that if 'ok' was true, then the resulting
pointer retrieved from the map would be non-nil. `nil` is a perfectly
valid value to store in a map of pointers, and the comma-ok syntax is
meant for when membership is distinct from having a non-zero value.
There were only one or two cases that I saw where it was being used
correctly for maps of pointers.
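A small illustration of the pitfall (names are hypothetical):

package main

import "fmt"

type session struct{ user string }

func main() {
	m := map[string]*session{"alice": nil} // nil is a valid value to store

	if s, ok := m["alice"]; ok {
		// ok only means the key is present; s is still nil, and
		// dereferencing it here would panic.
		fmt.Println(s == nil) // prints: true
	}
}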
Other applications or services sometimes expose a header containing a
unique ID, which can then be included in logging or response information
to allow an operator to link inter-service requests. The most common
header name used by services in the wild appears to be `X-Request-ID`,
but `Request-Id` is also used.
This commit adds support for specifying either `X-Request-ID` or
`Request-Id` headers, which will then be used by InfluxDB when logging
request information, and also in the `X-Request-ID` and `Request-Id`
response headers.
We populate both `X-Request-ID` and `Request-Id` to maintain backwards
compatibility with previous versions, and to support the more common
`X-Request-ID` header name.
If both `X-Request-ID` and `Request-Id` are specified, then
`X-Request-ID` is used.
If neither header is specified, then in line with previous behaviour, we
generate a v1 UUID.
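A minimal sketch of the precedence rules; uuid.NewUUID stands in for
whichever v1 UUID generator is actually used:

import (
	"net/http"

	"github.com/google/uuid"
)

func requestID(r *http.Request) string {
	if id := r.Header.Get("X-Request-ID"); id != "" {
		return id // X-Request-ID wins when both headers are set
	}
	if id := r.Header.Get("Request-Id"); id != "" {
		return id
	}
	id, _ := uuid.NewUUID() // fall back to generating a v1 UUID
	return id.String()
}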
This commit provides more insight into server errors by both setting
the error on a response header and, in the case of server errors (5xx),
logging those error messages to the HTTPD log when [http] log_enabled =
true.
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
Removing the forced `Connection: close` header from the `/query`
endpoint. This was originally added because of golang/go#13165, but it
seems like it's possible to use pipelining with go 1.8 and http 1.1,
just not recommended.
After some testing, it appears that the channel returned by
`ResponseWriter.CloseNotify()` will not send a value if the connection
was not interrupted. We already account for this in /query by exiting
from the goroutine if the request has finished by signaling another
channel.
Since the handler already accounts for the possibility that the channel
will not signal and since `CloseNotify()` doesn't interfere with a
pipelined request, we can remove the forced `Connection: close` that was
added to force clients to establish a new connection.
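A minimal sketch of that pattern, with illustrative names; the goroutine
exits when either channel fires, so it never leaks waiting on a CloseNotify
channel that will never send:

// watchClose cancels the running query if the client disconnects, and
// returns once the request signals completion via done.
func watchClose(closeNotify <-chan bool, done <-chan struct{}, cancel func()) {
	select {
	case <-closeNotify: // connection was interrupted
		cancel()
	case <-done: // request finished normally
	}
}

The handler would run this as `go watchClose(...)` and close done when the
request completes.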
This commit adds a new environment variable INFLUXDB_PANIC_CRASH, which
when set to a truthy value, e.g., true, TRUE, 1, will prevent the server
from recovering from a panic.
Recover currently occurs in two places: the HTTP handler and the
QueryExecutor. INFLUXDB_PANIC_CRASH will control both.
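A minimal sketch of the guard (helper names are illustrative):

import (
	"os"
	"strconv"
)

// panicCrash reports whether INFLUXDB_PANIC_CRASH is set to a truthy
// value such as true, TRUE, or 1.
func panicCrash() bool {
	v, err := strconv.ParseBool(os.Getenv("INFLUXDB_PANIC_CRASH"))
	return err == nil && v
}

func safely(fn func()) {
	defer func() {
		if panicCrash() {
			return // do not recover; let the server crash
		}
		if err := recover(); err != nil {
			// log the panic and increment the recoveredPanics stat
		}
	}()
	fn()
}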
Further, this commit adds _internal stats that will monitor the
occurrence of panics all the time (regardless of whether
INFLUXDB_PANIC_CRASH has been set to true or not).
The recovered panic frequency can be inspected with the following
queries:
SELECT "recoveredPanics" FROM "_internal"."monitor"."httpd";
SELECT "recoveredPanics" FROM "_internal"."monitor"."queryExecutor";
* Fix an issue where panicking before Write closed the gzip writer,
causing the headers to be written with a default status of 200 OK
* Update the recovery middleware to set 500 Internal Server Error
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:
curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
curl -o cpu.txt "http://localhost:8086/debug/pprof/profile"
This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).
This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30 if absent.
The new command for gathering profiles from users should now be:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"
Or, if we need to see a CPU profile:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"
It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.
Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
When a client requests `/debug/requests`, the server will wait for 30
seconds (configurable by specifying `seconds=` in the query parameters)
while the HTTP handler tracks every incoming query and write to the
system. After that time period has passed, it will output a JSON blob
that looks very similar to `/debug/vars` and shows every IP address and
user account (if authentication is used) that connected to the host
during that time.
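For example, to sample for 60 seconds instead of the default 30:
curl "http://localhost:8086/debug/requests?seconds=60"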
In the future, we can add more metrics to track. This is an initial
start to aid with debugging machines that connect too often by looking
at a sample of time (like `/debug/pprof`).
This commit adds a caching mechanism to the Data object, such that
when large numbers of users exist in the system, the cost of determining
if there is at least one admin user will be low.
To ensure that previously marshalled Data objects contain the correct
cached admin user value, we exhaustively determine if there is an admin
user present whenever we unmarshal a Data object.
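A minimal sketch of the idea, with illustrative field and method names,
assuming UserInfo carries an Admin flag:

type Data struct {
	Users []UserInfo

	// adminUserExists caches whether at least one admin user exists;
	// maintained on user changes and recomputed during unmarshal.
	adminUserExists bool
}

func (d *Data) AdminUserExists() bool { return d.adminUserExists }

// rebuildAdminCache exhaustively recomputes the cached value, e.g. when
// unmarshalling a previously marshalled Data object.
func (d *Data) rebuildAdminCache() {
	d.adminUserExists = false
	for i := range d.Users {
		if d.Users[i].Admin {
			d.adminUserExists = true
			return
		}
	}
}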
They rebased a revision we were previously relying upon that allowed us
to use the vanity name so we are reverting back to an older version with
the old import path.
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.
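A minimal sketch of the interface's shape, assuming the package's existing
Privilege type and a single database-level method (the exact method set is
an assumption):

// Authorizer determines whether the execution context may access the
// named database with the given privilege (e.g. read or write).
type Authorizer interface {
	AuthorizeDatabase(p Privilege, name string) bool
}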
When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.
When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.
This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.
Fixes #4785.
It looks like the real import path to the project is go.uber.org/zap
instead of github.com/uber-go/zap since the example in the project
references that path.
The logging library has been switched to use uber-go/zap. While the
logging has been changed to use structured logging, this commit does not
change any of the logging statements to take advantage of the new
structured log or new log levels. Those changes will come in future
commits.
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.
This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.
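For example, a chunked response might mark both an unfinished series and an
unfinished result like this (field values are illustrative):

{
    "results": [
        {
            "series": [
                {
                    "name": "cpu",
                    "columns": ["time", "value"],
                    "values": [["2016-06-06T14:55:00Z", 91.2]],
                    "partial": true
                }
            ],
            "partial": true
        }
    ]
}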
Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
When the `max-row-limit` was hit, the goroutine reading from the results
channel would stop reading from the channel, but it didn't signal to the
sender that it was no longer reading from the results. This caused the
sender to continue trying to send results even though nobody would ever
read it and this created a deadlock.
Include an `AbortCh` on the `ExecutionContext` that will signal when
results are no longer desired so the sender can abort instead of
deadlocking.
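A minimal sketch of the abort-aware send, with illustrative names:

type Result struct{} // stand-in for the query result type

// emit sends a result unless the reader has signalled via abort that
// results are no longer desired, which prevents the deadlock above.
func emit(results chan<- *Result, abort <-chan struct{}, r *Result) bool {
	select {
	case results <- r:
		return true
	case <-abort: // e.g. the max-row-limit was reached
		return false
	}
}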
This changes the behavior of the max-series-per-database and
max-values-per-tag limits to drop points that would exceed the limits
and allow the remaining points to be written. Previously, the whole
batch would fail and return a 500 error to the client.
This now writes the allowed points and returns a `partial write`
error indicating that some of the points were dropped, how many were
dropped, and one of the problem measurements and tag sets.
When we refactored expvar, the cmdline and memstats sections were not
re-added to the output. This adds them back if they can be found inside
of `expvar`.
It also stops trying to sort the output of the statistics so they are
returned faster. JSON doesn't need them to be sorted, and sorting them
added enough latency to hurt performance.
Previously, we implicitly appended a newline and added one to the
number of bytes transmitted to account for that byte. That behavior was
removed at some point, but the metric was not updated to record the
correct value.
The query killing functionality depends on the ResponseWriter exposing a
CloseNotify method. Since we wrap the http.ResponseWriter, the new
struct does not have that method and the HTTP handler would skip past
calling that method.
Instead of duplicating `Flush()` and `CloseNotify()` for every response
formatter, we will unify all of that under a single struct and create
formatters instead.
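A minimal sketch of preserving CloseNotify through a wrapping writer; the
struct name is illustrative:

import "net/http"

type responseWriter struct {
	http.ResponseWriter
}

// CloseNotify delegates to the wrapped writer when it implements
// http.CloseNotifier, so query killing keeps working.
func (w *responseWriter) CloseNotify() <-chan bool {
	if notifier, ok := w.ResponseWriter.(http.CloseNotifier); ok {
		return notifier.CloseNotify()
	}
	// The underlying writer cannot notify; return a channel that
	// never fires.
	return make(chan bool)
}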
Also, fixes a bug where the header information from a query would not
be returned until some other data was returned with it because of
buffering, and another bug where the gzipResponseWriter wouldn't flush
the actual underlying ResponseWriter.
The query can be uploaded from a file using `multipart/form-data` and
setting the file name to `q`. An example of using curl to execute an
async query would be:
curl -F "q=@database.iql" -F "async=true" http://localhost:8086/query
It will return a 204 No Content as long as the query is accepted
(immediate errors will be returned, but not individual errors with
specific queries). The only way to kill the query is by using the task
manager.
According to the HTTP standard, a lack of authentication credentials or
incorrect authentication credentials should send back a 401
(Unauthorized) with a `WWW-Authenticate` header with a challenge that
can be used to authenticate. This is because a 401 status should be sent
when an authentication attempt can be retried by the browser.
The 403 (Forbidden) status code should be sent when authentication
succeeded, but the user does not have the necessary authorization.
Previously, the server would always send a 401 status code.
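A minimal sketch of the two cases (the realm value is illustrative):

import "net/http"

func writeAuthError(w http.ResponseWriter, authenticated bool) {
	if !authenticated {
		// Missing or bad credentials: the attempt may be retried.
		w.Header().Set("WWW-Authenticate", `Basic realm="InfluxDB"`)
		http.Error(w, "unauthorized", http.StatusUnauthorized) // 401
		return
	}
	// Authenticated, but lacking the necessary authorization.
	http.Error(w, "forbidden", http.StatusForbidden) // 403
}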
Truncate the time interval output of the monitor service so that it
falls on even time-interval boundaries, rather than every minute
relative to the start time. This normalizes the output from the monitor
service.
Changes the httpd log lines from this:
[httpd] 2016/06/08 14:06:39 ::1 - - [08/Jun/2016:14:06:39 +0100] POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1 204 0 - InfluxDBClient d6aa01fc-2d79-11e6-8024-000000000000 2.751391ms
to this:
[httpd] ::1 - - [08/Jun/2016:14:06:39 +0100] "POST /write?consistency=any&db=telegraf&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" d6aa01fc-2d79-11e6-8024-000000000000 2751
So this changes a few things:
1. Removes the logger timestamp at the beginning, which isn't very
relevant anyway.
2. Adds quotes around "METHOD URI PROTOCOL", because this is part of the
Common Log Format.
3. Adds quotes around "AGENT" and "REFERRER", because this is part of
the "combined" log format.
4. Puts the response time in integer microseconds, because this is
consistent with Apache's %D config mod option.
Compared with CLF, our logs now look like this:
[httpd] %{COMMON_LOG_FORMAT} "<agent>" "<referrer>" <request_uuid> <response_time_µs>
For reference, see:
https://en.wikipedia.org/wiki/Common_Log_Format
http://httpd.apache.org/docs/current/mod/mod_log_config.html
This allows us to add additional options to ExecuteQuery without
creating parameter bloat.
Removing the unused Series structs. They were made unnecessary by a
previous commit, but the structs were not removed at that time.
Add another type of interrupt iterator that monitors the interrupt
channel and calls `Close()` on the iterator when the interrupt happens.
It will primarily be used for asynchronously closing the ReaderIterator,
but it will only close the read side of the connection properly. More
work needs to be done to allow closing the write side efficiently.
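A minimal sketch of the idea, with illustrative names; the Iterator
interface is reduced to its Close method:

type Iterator interface {
	Close() error
}

type closeInterruptIterator struct {
	input Iterator
	done  chan struct{}
}

func newCloseInterruptIterator(input Iterator, interrupt <-chan struct{}) *closeInterruptIterator {
	itr := &closeInterruptIterator{input: input, done: make(chan struct{})}
	go func() {
		select {
		case <-interrupt:
			// Asynchronously close the wrapped iterator, e.g. a
			// ReaderIterator blocked reading from the network.
			itr.input.Close()
		case <-itr.done:
		}
	}()
	return itr
}

func (itr *closeInterruptIterator) Close() error {
	close(itr.done) // stop the monitor goroutine
	return itr.input.Close()
}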
The parser can be passed a map of keys to literal values to be
substituted into the query. Parameters are preceded by a dollar sign
(`$`). If a parameter key is missing, the parser returns an error.
Fixes #2926.
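For example, a query using a bound parameter might look like this (values
are illustrative):
SELECT mean(value) FROM cpu WHERE host = $host
With the parameter map {"host": "server01"}, the parser substitutes the
string literal 'server01' for $host when the query is parsed.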
When authenticating a request, check that an admin user exists instead
of checking for len(users) > 0. This prevents getting stuck with no
admin user and being unable to create one.