There were two problems with this code:
1. the send on pending did not imply that the handler was running
2. there was a race between starting the handler and timing out
1 is fixed by sending to a begin channel inside the handler. It is
then guaranteed that the timeout handler code has been entered.
2 is fixed by attempting to acquire the semaphore channel once before
checking the timeout channel. That way, if there is capacity (which
in this test there is known to be), it is guaranteed to be taken. If
we instead selected on the semaphore and the timer at the same time
and the timer had already fired, there would be a pseudorandom chance
of the timer case being chosen even though there was capacity.
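A minimal Go sketch of the fixed pattern; the channel names and handler body are illustrative, not the actual test code:

```go
// Illustrative reconstruction of the fixed test synchronization.
package main

import (
	"fmt"
	"time"
)

func main() {
	sem := make(chan struct{}, 1) // known to have capacity in the test
	begin := make(chan struct{})
	timer := time.NewTimer(10 * time.Millisecond)
	defer timer.Stop()

	go func() {
		begin <- struct{}{} // fix 1: proves the handler has been entered
		// ... timeout handler body ...
	}()
	<-begin // wait until the handler is actually running

	// Fix 2: try the semaphore on its own first. If there is capacity it
	// is guaranteed to be taken; a combined select against the timer
	// would choose pseudorandomly when both are ready.
	select {
	case sem <- struct{}{}:
	default:
		select {
		case sem <- struct{}{}:
		case <-timer.C:
			fmt.Println("timed out")
			return
		}
	}
	fmt.Println("semaphore acquired")
}
```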
* Update Prometheus remote write to use metric name as measurement name and value as the field name.
* Update Prometheus remote read to use the storage.Read method to bypass the InfluxQL query engine.
Multiple users have attempted to run InfluxDB in a Docker container
with a Windows host and a volume mounted from Windows. That causes
problems because the volume apparently uses Samba/CIFS, which does not
support fsync on directories. With this patchset, if an fsync on a
directory returns EINVAL, as appears to happen on Samba/CIFS, the
error is ignored. This should help.
Fixes #9833.
Fixes #9630.
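A minimal sketch of the described behaviour, assuming a helper shaped roughly like this (the function and package names are hypothetical):

```go
package fs

import (
	"errors"
	"os"
	"syscall"
)

// fsyncDir syncs a directory, ignoring the EINVAL that Samba/CIFS
// mounts (e.g. a Windows volume mounted into a Docker container)
// return for fsync on a directory.
func fsyncDir(path string) error {
	d, err := os.Open(path)
	if err != nil {
		return err
	}
	defer d.Close()

	if err := d.Sync(); err != nil && !errors.Is(err, syscall.EINVAL) {
		return err
	}
	return nil
}
```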
This commit adds `debug-pprof-enabled` which will start the default
`net/http/pprof` endpoint and bind against `localhost:6060`. This
will help to debug startup performance issues.
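Something along these lines is all that is needed to expose the endpoint; the wiring of the `debug-pprof-enabled` option into server startup is illustrative here:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on DefaultServeMux
)

func main() {
	debugPprofEnabled := true // stand-in for the debug-pprof-enabled config option
	if debugPprofEnabled {
		go func() {
			// Bind to localhost only so the profiling endpoint is not
			// exposed beyond the machine.
			log.Println(http.ListenAndServe("localhost:6060", nil))
		}()
	}
	select {} // stand-in for the rest of server startup
}
```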
This commit adds throttling to the HTTP write endpoints based on
queue depth and, optionally, timeout. Two queues exist: `enqueued`
and `current`. The `current` queue is the number of concurrent
requests that can be processed. The `enqueued` queue limits the
maximum number of requests that can be waiting to be processed.
If the timeout is exceeded or the `enqueued` queue is full, then
a `503 Service Unavailable` response is returned and the error is
logged.
By default these options are turned off.
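A rough sketch of how such two-queue throttling can be built from buffered channels; the type and field names are assumptions, not the actual middleware:

```go
package httpd

import (
	"net/http"
	"time"
)

// Throttler limits concurrent and queued write requests (sketch).
type Throttler struct {
	current  chan struct{} // slots for requests being processed
	enqueued chan struct{} // slots for admitted requests (processing + waiting)
	Timeout  time.Duration // optional wait limit; zero means wait indefinitely
}

func NewThrottler(concurrency, maxEnqueued int) *Throttler {
	return &Throttler{
		current:  make(chan struct{}, concurrency),
		enqueued: make(chan struct{}, maxEnqueued),
	}
}

// Handler wraps next, returning 503 when the queue is full or the wait
// for a processing slot exceeds the timeout.
func (t *Throttler) Handler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case t.enqueued <- struct{}{}:
			defer func() { <-t.enqueued }()
		default: // enqueued queue is full
			http.Error(w, "service unavailable: queue full", http.StatusServiceUnavailable)
			return
		}

		var timeout <-chan time.Time
		if t.Timeout > 0 {
			timer := time.NewTimer(t.Timeout)
			defer timer.Stop()
			timeout = timer.C
		}
		select { // a nil timeout channel blocks forever, i.e. no limit
		case t.current <- struct{}{}:
			defer func() { <-t.current }()
		case <-timeout:
			http.Error(w, "service unavailable: timed out in queue", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```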
If the columns change between series, it will now act as if it were a
new statement id and reprint the headers. This only happens with SHOW
DIAGNOSTICS at the moment, and we shouldn't add this functionality
anywhere else anyway.
Along with modifying ExecutionContext to be a context and have the
TaskManager return the context itself, this also creates a Monitor
interface and exposes the Monitor through the Context. This way, we can
access the monitor from within the query.Select method and keep all of
the limits inside of the query package instead of leaking them into the
statement executor.
An eventual goal is to remove the InterruptCh from the IteratorOptions
and use the Context instead, but for now, we'll just assign the done
channel from the Context to the IteratorOptions so at least they refer
to the same channel.
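A hedged sketch of the shape this describes; the method and function names are guesses from the description, not the influxdb API:

```go
package query

import "context"

// Monitor watches a running query and aborts it when a limit is exceeded.
type Monitor interface {
	// MonitorFunc starts fn in the background; if fn returns an error,
	// the query is interrupted with that error.
	MonitorFunc(fn func(done <-chan struct{}) error)
}

type monitorKey struct{}

// WithMonitor attaches a Monitor to an execution context.
func WithMonitor(ctx context.Context, m Monitor) context.Context {
	return context.WithValue(ctx, monitorKey{}, m)
}

// MonitorFromContext retrieves the Monitor inside query.Select, keeping
// limit enforcement within the query package.
func MonitorFromContext(ctx context.Context) (Monitor, bool) {
	m, ok := ctx.Value(monitorKey{}).(Monitor)
	return m, ok
}
```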
Remove the `Query` prefix from some structs and interfaces. They were
there so when the query engine was in the same package as influxql,
these would be differentiated. Now that the package name is query, the
extra prefix seems redundant.
* Live Restore + Enterprise data format compatibility
* Extended ImportData to import all DBs if no db name given
* Added a new enterprise data test, and backup command now prints the backup file paths at conclusion
* Added whole-system backup test
* Update to use protobuf in all enterprise data cases
* Update to test to do cross-testing with enterprise version
* incremental enterprise backup format support
* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted
to the inmem package
* improve Measurement and Series types by exporting immutable
fields and removing unnecessary APIs and locks
Reduced startup time from 28s to 17s. Overall improvement including
#9162 reduces startup from 46s to 17s for 1MM series across 14 shards.
* did not handle cached values correctly
* sort shards by time in either ascending or descending
order depending on the RPC request ordering to ensure they
are traversed in the correct order.
The previous sha was taken from a revision on a devel branch that I
thought would continue staying in the tree after it was merged. That
revision was rebased away and the API was changed for the logger.
This updates the usage of the logger and adds a simple package for
constructing the base logger.
The 1.0 version of zap changed the format of the default console logger
so this change moves over to this new logger instead of attempting to
retain backwards compatibility with the old format.
TODO(sgc): implement `writer` type that handles all the details
of writing frames to the RPC stream. Additional responsibilities
of writer include
* point frame recycling to reduce memory pressure
* skip empty point frames
* skip series frames with no points
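A speculative sketch of what such a writer might look like; the frame types and stream interface are stand-ins, not the generated RPC types:

```go
package rpc

type pointsFrame struct {
	timestamps []int64
	values     []float64
}

type seriesFrame struct {
	key    []byte
	frames []*pointsFrame
}

// stream abstracts the send side of the RPC stream.
type stream interface {
	Send(*seriesFrame) error
}

type writer struct {
	s    stream
	pool []*pointsFrame // recycled point frames to reduce memory pressure
}

// frame returns a recycled points frame when one is available.
func (w *writer) frame() *pointsFrame {
	if n := len(w.pool); n > 0 {
		f := w.pool[n-1]
		w.pool = w.pool[:n-1]
		f.timestamps, f.values = f.timestamps[:0], f.values[:0]
		return f
	}
	return &pointsFrame{}
}

// writeSeries sends a series frame, dropping empty point frames and
// skipping series frames that contain no points at all.
func (w *writer) writeSeries(f *seriesFrame) error {
	kept := f.frames[:0]
	for _, pf := range f.frames {
		if len(pf.timestamps) == 0 {
			w.pool = append(w.pool, pf) // recycle empty point frames
			continue
		}
		kept = append(kept, pf)
	}
	f.frames = kept
	if len(f.frames) == 0 {
		return nil // skip series frames with no points
	}
	return w.s.Send(f)
}
```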
When a meta query does not include a time component, it can be
answered exclusively by the index. This should result in much faster
query execution than if the TSM engine were engaged.
This commit rewrites the following queries such that they make use
of the index where no time component is present:
- SHOW MEASUREMENTS
- SHOW SERIES
- SHOW TAG KEYS
- SHOW FIELD KEYS
Fixes #8819.
Previously, the process of dropping expired shards according to the
retention policy duration was managed by two independent goroutines in
the retention policy service. This behaviour was introduced in #2776,
at a time when there were both data and meta nodes in the OSS codebase.
The idea was that only the leader meta node would run the meta data
deletions in the first goroutine, and all other nodes would run the
local deletions in the second goroutine.
InfluxDB no longer operates in that way, so we ended up with two
independent goroutines carrying out actions that really depended on
each other.
If the second goroutine runs before the first then it may not see the
meta data changes indicating shards should be deleted and it won't
delete any shards locally. Shortly after this the first goroutine will
run and remove the meta data for the shard groups.
This results in a situation where it looks like the shards have gone,
but in fact they remain on disk (and importantly, their series within
the index) until the next time the second goroutine runs. By default
that's 30 minutes.
In the case where the shards to be removed contained the last
occurrences of some series, and the database was already at its
maximum series limit (or tag limit, for that matter), it's possible
that no further new series could be inserted.
* Fprint* functions
* No naked returns
* clarify panic messages
* spacing between case statements
* remove break in favor of return
* remove goto in favor of for { continue }
All of these services start up goroutines and then wait for the
goroutines to finish. Each of them has a `tsdb.PointBatcher` that may
return a point during the shutdown sequence. During the shutdown
sequence, a lock was held. This lock may get accessed when attempting to
write the point that came back from the `tsdb.PointBatcher`. This caused
the read lock attempt to wait forever for the write lock to be unlocked
during `Close()`.
This modifies these methods so that the write lock is released while
waiting for goroutines to finish in these three services.
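The deadlock and fix in miniature, using illustrative names rather than the actual service code:

```go
package service

import "sync"

type point struct{}

type Service struct {
	mu   sync.RWMutex
	wg   sync.WaitGroup
	done chan struct{}
}

// writePoint is called for points the batcher returns, even during shutdown.
func (s *Service) writePoint(p point) {
	s.mu.RLock() // waits forever if Close still holds the write lock
	defer s.mu.RUnlock()
	// ... write the point ...
}

// Close signals shutdown, then releases the write lock *before* waiting
// for goroutines, so a point flushed by the batcher can still take the
// read lock and the service can drain cleanly.
func (s *Service) Close() error {
	s.mu.Lock()
	close(s.done)
	s.mu.Unlock()

	s.wg.Wait()
	return nil
}
```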
Adds a new package prometheus for converting from remote reads and writes to Influx queries and points. Adds two new endpoints to the httpd handler to support prometheus remote read at /api/v1/prom/read and remote write at /api/v1/prom/write.
The only thing used from Prometheus is the storage/remote files that are generated from the remote.proto file. That file was copied into the prometheus/remote package to avoid an extra dependency.
There are several places in the code where comma-ok map retrieval was
being used poorly. Some were benign, like checking existence before
issuing an unconditional delete with no cleanup. Others were potentially
far more serious: assuming that if 'ok' was true, then the resulting
pointer retrieved from the map would be non-nil. `nil` is a perfectly
valid value to store in a map of pointers, and the comma-ok syntax is
meant for when membership is distinct from having a non-zero value.
There were only one or two cases that I saw where it was being used
correctly for maps of pointers.
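The hazard in miniature:

```go
package main

import "fmt"

type thing struct{ name string }

func main() {
	m := map[string]*thing{
		"a": nil, // nil is a perfectly valid value in a map of pointers
	}

	if t, ok := m["a"]; ok {
		// ok == true only means the key exists; t may still be nil.
		if t == nil {
			fmt.Println("present but nil: dereferencing t would panic")
		}
	}
}
```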
Other applications or services sometimes expose a header containing a
unique ID, which can then be included in logging or response information
to allow an operator to link inter-service requests. The most common
header name used by services in the wild appears to be `X-Request-ID`,
but `Request-Id` is also used.
This commit adds support for specifying either `X-Request-ID` or
`Request-Id` headers, which will then be used by InfluxDB when logging
request information, and also in the `X-Request-ID` and `Request-Id`
response headers.
We populate both `X-Request-ID` and `Request-Id` to maintain backwards
compatibility with previous versions, and to support the more common
`X-Request-ID` header name.
If both `X-Request-ID` and `Request-Id` are specified, then
`X-Request-ID` is used.
If neither header is specified, then in line with previous behaviour, we
generate a v1 UUID.
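A sketch of the precedence rules described; the handler wiring and the UUID helper are placeholders:

```go
package httpd

import (
	"crypto/rand"
	"fmt"
	"net/http"
)

// newV1UUID stands in for the server's v1 UUID generator.
func newV1UUID() string {
	var b [16]byte
	rand.Read(b[:])
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// requestID resolves the request ID used for logging and response headers.
func requestID(w http.ResponseWriter, r *http.Request) string {
	// X-Request-ID wins when both headers are present.
	rid := r.Header.Get("X-Request-ID")
	if rid == "" {
		rid = r.Header.Get("Request-Id")
	}
	if rid == "" {
		rid = newV1UUID() // neither header specified: previous behaviour
	}

	// Populate both response headers for backwards compatibility.
	w.Header().Set("X-Request-ID", rid)
	w.Header().Set("Request-Id", rid)
	return rid
}
```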
The ConditionExpr function is more accurate because it parses the
condition and ensures that time conditions are actually used correctly.
That means that attempting to combine conditions with OR will not
result in the query silently pretending it's an AND, and nested
conditions work correctly, so there is only one way to read the query.
It also extracts the non-time conditions into a separate condition so we
can stop attempting to parse around the time conditions in lower layers
of the storage engine. This change does not remove those hacks, but a
following commit should be able to sanitize the condition and remove
them.
This commit provides more insight into server errors by both setting
the error on a response header, and, in the case of server errors (5xx),
logging those error messages to the HTTPD log, if `[http] log_enabled = true`.
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
This calculates the start and end time along with any time zones shifts
so that continuous queries are run at the correct time when a time zone
is included in the query.
Removing the forced `Connection: close` header from the `/query`
endpoint. This was originally added because of golang/go#13165, but it
seems like it's possible to use pipelining with go 1.8 and http 1.1,
just not recommended.
After some testing, it appears that the channel returned by
`ResponseWriter.CloseNotify()` will not send a value if the connection
was not interrupted. We already account for this in /query by exiting
from the goroutine if the request has finished by signaling another
channel.
Since the handler already accounts for the possibility that the channel
will not signal and since `CloseNotify()` doesn't interfere with a
pipelined request, we can remove the forced `Connection: close` that was
added to force clients to establish a new connection.
The Go zero time leads Truncate to start a week on Monday, but the
query engine uses unix timestamps, where the week starts on a
Thursday. This updates the service to use a custom truncate method
that uses the unix timestamp instead of `time.Time`.
Fixes #8569.
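A sketch of truncation anchored at the Unix epoch rather than Go's zero time (the function name is assumed):

```go
package cq

import "time"

// truncate floors ts to a multiple of d measured from the Unix epoch,
// so a 1w interval starts on Thursday, matching the query engine.
func truncate(ts time.Time, d time.Duration) time.Time {
	rem := ts.UnixNano() % int64(d)
	if rem < 0 {
		rem += int64(d) // floor, not round toward zero, for pre-epoch times
	}
	return time.Unix(0, ts.UnixNano()-rem).UTC()
}
```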
This commit adds a new environment variable INFLUXDB_PANIC_CRASH, which
when set to a truthy value, e.g., true, TRUE, 1, will prevent the server
from recovering from a panic.
Recover currently occurs in two places: the HTTP handler and the
QueryExecutor. INFLUXDB_PANIC_CRASH will control both.
Further, this commit adds _internal stats that will monitor the
occurrence of panics all the time (regardless of whether
INFLUXDB_PANIC_CRASH has been set to true or not).
The recovered panic frequency can be inspected with the following
queries:
SELECT "recoveredPanics" FROM "_internal"."monitor"."httpd";
SELECT "recoveredPanics" FROM "_internal"."monitor"."queryExecutor";
* off by default, enabled by `query-stats-enabled`
* writes to cq_query measurement of configured monitor database
* see CHANGELOG for schema of individual points
* fix issue where, if a panic occurred before Write, the gzip writer
  was closed, causing the header to be written and a default status of
  200 OK to be sent
* update recovery middleware to set 500 Internal Server Error
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:
curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
curl -o cpu.txt "http://localhost:8086/debug/pprof/profile"
This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).
This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30, if absent.
The new command for gathering profiles from users should now be:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"
Or, if we need to see a CPU profile:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"
It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.
Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
Measurement name and field were converted between []byte and string
repeatedly, causing lots of garbage. This switches the code to use
[]byte in the write path.
After using `/debug/requests`, the client will wait for 30 seconds
(configurable by specifying `seconds=` in the query parameters) and the
HTTP handler will track every incoming query and write to the system.
After that time period has passed, it will output a JSON blob that looks
very similar to `/debug/vars` that shows every IP address and user
account (if authentication is used) that connected to the host during
that time.
In the future, we can add more metrics to track. This is an initial
start to aid with debugging machines that connect too often by looking
at a sample of time (like `/debug/pprof`).
This commit adds a caching mechanism to the Data object, such that
when large numbers of users exist in the system, the cost of determining
if there is at least one admin user will be low.
To ensure that previously marshalled Data objects contain the correct
cached admin user value, we exhaustively determine if there is an admin
user present whenever we unmarshal a Data object.
Instead of incrementing the `queryOk` statistic with or without the
continuous query running, it will only increment when the query is
actually executed.
max-row-limit was set at 10000 since 1.0, but due to a bug it was
effectively 0 (disabled). 1.2 fixed this bug via #7368, but this
caused a breaking change w/ Grafana and any users upgrading from <1.2
who had not disabled the config manually.
They rebased a revision we were previously relying upon that allowed us
to use the vanity name so we are reverting back to an older version with
the old import path.
This commit introduces a new interface type, influxql.Authorizer, that
is passed as part of a statement's execution context and determines
whether the context is permitted to access a given database. In the
future, the Authorizer interface may be expanded to other resources
besides databases. In this commit, the Authorizer interface is
specifically used to determine which databases are returned when
executing SHOW DATABASES.
When HTTP authentication is enabled, the existing meta.UserInfo struct
implements Authorizer, meaning admin users can SHOW every database, and
non-admin users can SHOW only databases for which they have read and/or
write permission.
When HTTP authentication is disabled, all databases are visible through
SHOW DATABASES.
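The interface as described, sketched with guessed names and a stand-in Privilege type (not copied from the source):

```go
package influxql

// Privilege stands in for the parser's privilege type.
type Privilege int

const (
	NoPrivileges Privilege = iota
	ReadPrivilege
	WritePrivilege
	AllPrivileges
)

// Authorizer is carried in a statement's execution context and decides
// whether that context may access a given database.
type Authorizer interface {
	AuthorizeDatabase(p Privilege, name string) bool
}

// openAuthorizer models the auth-disabled case: everything is visible,
// so SHOW DATABASES returns all databases.
type openAuthorizer struct{}

func (openAuthorizer) AuthorizeDatabase(Privilege, string) bool { return true }
```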
This addresses a long-standing issue where Chronograf or Grafana would
be unable to list databases if the logged-in user did not have admin
privileges.
Fixes #4785.
This change adds some very basic name validation with the following
plain-english description: names must be a non-empty sequence of
printable characters that do not contain slashes ('/' or '\') and are
not equal to either "." or "..".
The intent is that, since we currently just use database and retention
policy names directly as path elements, these rules will hopefully leave
us with names that should be at least close to valid directory names.
Ideally, we would restrict names even further or not use them as path
elements directly, but this should be a step towards the former without
restricting names "too much".
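The rules transcribed directly into Go (the function and package names are assumed):

```go
package meta

import (
	"strings"
	"unicode"
)

// validName reports whether name is a non-empty sequence of printable
// characters, free of slashes, and not equal to "." or "..".
func validName(name string) bool {
	if name == "" || name == "." || name == ".." {
		return false
	}
	if strings.ContainsAny(name, `/\`) {
		return false
	}
	for _, r := range name {
		if !unicode.IsPrint(r) {
			return false
		}
	}
	return true
}
```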
This adds query syntax support for subqueries and adds support to the
query engine to execute queries on subqueries.
Subqueries act as a source for another query. It is the equivalent of
writing the results of a query to a temporary database, executing
a query on that temporary database, and then deleting the database
(except this is all performed in-memory).
The syntax is like this:
SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *)
This will execute derivative and then sum the result of those derivatives.
Another example:
SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host)
This would let you find the maximum minimum value of each host.
There is complete freedom to mix subqueries with auxiliary fields. The only
caveat is that the following two queries:
SELECT mean(value) FROM cpu
SELECT mean(value) FROM (SELECT value FROM cpu)
have different performance characteristics. The first will calculate
`mean(value)` at the shard level and will be faster, especially when it
comes to clustered setups. The second will process the mean at the top
level and will not include that optimization.
There was no comment on either case specifying that the `return nil`
was deliberate instead of `return err`, so I'm assuming these were
typos. I added tests to preserve the error-returning behavior.
It looks like the real import path to the project is go.uber.org/zap
instead of github.com/uber-go/zap since the example in the project
references that path.
The logging library has been switched to use uber-go/zap. While the
logging has been changed to use structured logging, this commit does not
change any of the logging statements to take advantage of the new
structured log or new log levels. Those changes will come in future
commits.
The URL must have a scheme of udp, http, or https, and a port number.
CREATE SUBSCRIPTION will fail if there are invalid destinations.
Additionally, Service.createSubscription fails if invalid destinations
are detected.
Fixes #7615.
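A sketch of the validation described (function and package names assumed):

```go
package subscriber

import (
	"errors"
	"net/url"
)

// validateDestination enforces the scheme and port rules above.
func validateDestination(dest string) error {
	u, err := url.Parse(dest)
	if err != nil {
		return err
	}
	switch u.Scheme {
	case "udp", "http", "https":
	default:
		return errors.New("invalid subscription URL: scheme must be udp, http, or https")
	}
	if u.Port() == "" {
		return errors.New("invalid subscription URL: port number required")
	}
	return nil
}
```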
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.
This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.
Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.