Adds a new package prometheus for converting from remote reads and writes to Influx queries and points. Adds two new endpoints to the httpd handler to support Prometheus remote read at /api/v1/prom/read and remote write at /api/v1/prom/write.
The only thing used from Prometheus is the storage/remote code generated from the remote.proto file. That file was copied into the prometheus/remote package to avoid an extra dependency.
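The conversion works roughly like the sketch below. The stand-in types mirror the messages defined in remote.proto, and the line-protocol mapping (metric name from the `__name__` label as the measurement, remaining labels as tags, a single `value` field per sample) is illustrative rather than the package's actual code:
```
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// Stand-in types mirroring the messages generated from remote.proto.
type Label struct{ Name, Value string }

type Sample struct {
	Value       float64
	TimestampMs int64
}

type TimeSeries struct {
	Labels  []Label
	Samples []Sample
}

// toLineProtocol sketches the mapping: the __name__ label becomes the
// measurement, the remaining labels become tags, and each sample becomes a
// point with a single "value" field.
func toLineProtocol(series TimeSeries) []string {
	var measurement string
	tags := make([]string, 0, len(series.Labels))
	for _, l := range series.Labels {
		if l.Name == "__name__" {
			measurement = l.Value
			continue
		}
		tags = append(tags, l.Name+"="+l.Value)
	}
	sort.Strings(tags)

	lines := make([]string, 0, len(series.Samples))
	for _, s := range series.Samples {
		t := time.Unix(0, s.TimestampMs*int64(time.Millisecond))
		lines = append(lines, fmt.Sprintf("%s,%s value=%g %d",
			measurement, strings.Join(tags, ","), s.Value, t.UnixNano()))
	}
	return lines
}

func main() {
	ts := TimeSeries{
		Labels:  []Label{{"__name__", "cpu_usage"}, {"host", "server01"}},
		Samples: []Sample{{Value: 0.64, TimestampMs: 1500000000000}},
	}
	fmt.Println(toLineProtocol(ts))
}
```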
Deleting high cardinality series could take a very long time, cause
write timeouts, and even deadlock the process. This fixes these
issues by changing the approach used to clean up the indexes and by
reducing lock contention.
The prior approach deleted each series and updated every index (inmem)
during the delete. This was very slow and caused the index to be locked
while items in a slice were removed one by one. This has been changed
to mark series as deleted and then rebuild the index asynchronously, which
speeds up the process.
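A minimal sketch of the idea (illustrative only, not the actual tsdb index code): a delete records tombstones in O(1) under the lock, and the slice-backed index is compacted in a single asynchronous pass instead of removing items one by one:
```
package index

import "sync"

type seriesIndex struct {
	mu      sync.RWMutex
	series  []string            // slice-backed index of series keys
	deleted map[string]struct{} // tombstones for series pending removal
}

// DeleteSeries only marks the keys as deleted; the expensive slice surgery is
// deferred to an asynchronous rebuild so writers are not blocked for long.
func (idx *seriesIndex) DeleteSeries(keys []string) {
	idx.mu.Lock()
	for _, k := range keys {
		idx.deleted[k] = struct{}{}
	}
	idx.mu.Unlock()

	go idx.rebuild()
}

// rebuild compacts the index in one pass, dropping tombstoned entries.
func (idx *seriesIndex) rebuild() {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	kept := idx.series[:0]
	for _, k := range idx.series {
		if _, ok := idx.deleted[k]; !ok {
			kept = append(kept, k)
		}
	}
	idx.series = kept
	idx.deleted = make(map[string]struct{})
}
```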
There was also a deadlock that could occur when deleting the field set.
Deleting the field set held a write lock, and a function invoked under
that lock could try to take a read lock on the same field set, which
would deadlock. That approach was also very slow and caused write timeouts.
It now uses a faster approach that checks for the existence of the measurement
in the cache and file store, which does not take write locks.
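For context, the deadlock pattern looks roughly like this (illustrative names, not the actual field-set code): Go's `sync.RWMutex` is not reentrant, so a goroutine that holds the write lock blocks forever if something it calls tries to take the read lock:
```
package fields

import "sync"

type fieldSet struct {
	mu     sync.RWMutex
	fields map[string]struct{}
}

func (fs *fieldSet) has(name string) bool {
	fs.mu.RLock() // blocks forever if the caller already holds fs.mu.Lock()
	defer fs.mu.RUnlock()
	_, ok := fs.fields[name]
	return ok
}

func (fs *fieldSet) deleteField(name string) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	if fs.has(name) { // deadlock: RLock is attempted while the write lock is held
		delete(fs.fields, name)
	}
}
```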
```
curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" \
--data-urlencode "q=SELECT * FROM cpu WHERE host='server01' AND time < now() - 1d"
```
Running this on Ubuntu under the fish shell produces an error:
```
curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=SELECT mean(load) FROM cpu WHERE region='uswest'"
No matches for wildcard “http://localhost:8086/query?pretty=true”.
fish: curl -G http://localhost:8086/query?pretty=true --data-urlencode "db=mydb" --data-urlencode "q=SELECT mean(load) FROM cpu WHERE region='uswest'"
^
curl: no URL specified!
curl: try 'curl --help' or 'curl --manual' for more information
```
This happens because the URL is not enclosed in `'` quotes, so fish tries to expand the `?` in the URL as a wildcard. Quoting the URL avoids the error.
It prints the statistics of each iterator that will access the storage
engine. For each access of the storage engine, it will print the number
of shards that will potentially be accessed, the number of files that
may be accessed, the number of series that will be created, the number
of blocks, and the size of those blocks.
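Those numbers map naturally onto a small per-iterator cost structure, roughly like the sketch below (the field names are assumptions for illustration, not the engine's actual struct):
```
package query

// IteratorCost is an illustrative shape for the per-iterator statistics
// described above; the real engine's struct and field names may differ.
type IteratorCost struct {
	NumShards int64 // shards that will potentially be accessed
	NumFiles  int64 // TSM files that may be accessed
	NumSeries int64 // series that will be created
	NumBlocks int64 // blocks that may be read
	BlockSize int64 // total size of those blocks, in bytes
}

// Combine sums two costs so totals can be reported per storage-engine access.
func (c IteratorCost) Combine(other IteratorCost) IteratorCost {
	return IteratorCost{
		NumShards: c.NumShards + other.NumShards,
		NumFiles:  c.NumFiles + other.NumFiles,
		NumSeries: c.NumSeries + other.NumSeries,
		NumBlocks: c.NumBlocks + other.NumBlocks,
		BlockSize: c.BlockSize + other.BlockSize,
	}
}
```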
This is used quite a bit to determine which fields are needed in a
condition. When the condition gets large, the memory usage begins to
slow it down considerably, and it does not deduplicate the field names it collects.
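Collecting the names into a set addresses the duplicates; something along these lines, using tiny stand-in expression types rather than the real influxql AST:
```
package cond

// Tiny stand-in expression types; the real code walks influxql AST nodes.
type Expr interface{}

type VarRef struct{ Val string }

type BinaryExpr struct{ LHS, RHS Expr }

// FieldNames collects every field referenced in a condition exactly once by
// accumulating into a set, instead of appending duplicates to a slice and
// growing memory with the size of the condition.
func FieldNames(cond Expr) map[string]struct{} {
	names := make(map[string]struct{})
	var walk func(Expr)
	walk = func(e Expr) {
		switch e := e.(type) {
		case *VarRef:
			names[e.Val] = struct{}{}
		case *BinaryExpr:
			walk(e.LHS)
			walk(e.RHS)
		}
	}
	walk(cond)
	return names
}
```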
influx_inspect walks the data and wal directories building a list of
files to export. It then opens, reads, and exports each. If the file was
deleted between the time it was added to the list and the time the
inspect tool attempts to read it, the file is now skipped without
emitting an error.
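The skip behaves roughly like this sketch (the helper and its signature are illustrative, not the actual influx_inspect code):
```
package export

import "os"

// exportFiles opens and exports each file from the previously built list,
// silently skipping any file that was deleted after it was listed.
func exportFiles(paths []string, export func(*os.File) error) error {
	for _, p := range paths {
		f, err := os.Open(p)
		if os.IsNotExist(err) {
			continue // deleted between listing and reading; skip without error
		}
		if err != nil {
			return err
		}
		if err := export(f); err != nil {
			f.Close()
			return err
		}
		f.Close()
	}
	return nil
}
```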
There are several places in the code where comma-ok map retrieval was
being used poorly. Some were benign, like checking existence before
issuing an unconditional delete with no cleanup. Others were potentially
far more serious: assuming that if 'ok' was true, then the resulting
pointer retrieved from the map would be non-nil. `nil` is a perfectly
valid value to store in a map of pointers, and the comma-ok syntax is
meant for when membership is distinct from having a non-zero value.
There were only one or two cases I saw where this was being used correctly
for maps of pointers.
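A small demonstration of the pitfall: `ok` only reports that the key is present, so a nil pointer stored under an existing key still yields `ok == true` (the types here are made up for illustration):
```
package main

import "fmt"

type shard struct{ id uint64 }

func main() {
	shards := map[string]*shard{
		"s1": {id: 1},
		"s2": nil, // nil is a perfectly valid value for a key that exists
	}

	if sh, ok := shards["s2"]; ok {
		// ok is true even though sh is nil; dereferencing sh would panic.
		fmt.Println(sh == nil) // prints: true
	}
}
```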
Previously, subqueries would honor their own ordering. We never really
supported that and I have no idea if it would work since most parts in
the query engine assume that points are being delivered in only one
ordering.
Subqueries have now been modified so that if a different ordering is
specified in a subquery, running the query returns an error. If an
ordering is specified in the topmost query, that ordering is propagated
to all subqueries.
Fixes #8699.
Now, the prepared statement keeps the open resource, and closing the
resource created by `Prepare` is the responsibility of the prepared
statement.
This also nils out the local shard mapping after it is closed to prevent
it from being used after it is closed.
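In rough terms, the ownership now looks like this (hypothetical names, not the actual query package types):
```
package query

import "io"

// PreparedStatement owns the resource opened during Prepare (the local shard
// mapping) and is responsible for closing it.
type PreparedStatement struct {
	shards io.Closer
}

// Close releases the shard mapping and nils the reference so the statement
// cannot be used again after it has been closed.
func (p *PreparedStatement) Close() error {
	if p.shards == nil {
		return nil
	}
	err := p.shards.Close()
	p.shards = nil
	return err
}
```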
This refactors the validation code so it is more flexible and performs a
small bit of work to make preparing and executing the query easier.
The general idea is that compilation will eventually do more heavy
lifting in creating the initial plan and prepare will construct an
actual plan rather than just doing some basic field rewriting.
This change at least sets us up for that change in the future and moves
the validation code to the query execution instead of in the parser.
This also frees up the parser to parse the complete AST without worrying
if the query itself is valid. That could be useful for client code that
wants to compile a partial query to an AST and then perform
modifications on the AST for some reason.
The first call is to compile the query. This performs some initial
processing that can be done before having any access to the shards. At
the moment, it does very little, but it's intended to be changed to
eventually perform initial validations of the query and create an
internal graph structure for the execution of the query.
The second call is to prepare the query. This step has access to the
shard mapper. Right now, it just maps the shards and rewrites the fields
of the query for any wildcards. In the future, it is intended to do the
above, but also to prepare the final directed acyclic graph that will
execute the query.
The third call is to select the query. This step is intended to create
all of the iterators for processing the query. At the moment, much of
the work intended for the second step is performed in the third step.
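Put together, the intended three-call flow looks roughly like the sketch below; all of the names, and the toy string-based shards and iterators, are illustrative rather than the real query package API:
```
package main

import "fmt"

// Statement stands in for the parsed AST of a single SELECT.
type Statement struct{ text string }

// ShardMapper gives the prepare step access to the shards it needs.
type ShardMapper interface {
	MapShards(stmt *Statement) ([]string, error)
}

type CompiledStatement struct{ stmt *Statement }

// Compile performs the processing that needs no access to the shards.
func Compile(stmt *Statement) (*CompiledStatement, error) {
	// initial validation / internal graph construction would happen here
	return &CompiledStatement{stmt: stmt}, nil
}

type PreparedStatement struct {
	stmt   *Statement
	shards []string // the mapped shards, owned until Close as described above
}

// Prepare maps the shards and rewrites wildcard fields.
func (c *CompiledStatement) Prepare(m ShardMapper) (*PreparedStatement, error) {
	shards, err := m.MapShards(c.stmt)
	if err != nil {
		return nil, err
	}
	return &PreparedStatement{stmt: c.stmt, shards: shards}, nil
}

// Select builds the iterators that will actually read from the storage engine.
func (p *PreparedStatement) Select() ([]string, error) {
	itrs := make([]string, 0, len(p.shards))
	for _, s := range p.shards {
		itrs = append(itrs, "iterator("+s+")")
	}
	return itrs, nil
}

// staticMapper is a trivial ShardMapper used only for this example.
type staticMapper []string

func (m staticMapper) MapShards(*Statement) ([]string, error) { return m, nil }

func main() {
	c, _ := Compile(&Statement{text: "SELECT * FROM cpu"})
	p, _ := c.Prepare(staticMapper{"shard-1", "shard-2"})
	itrs, _ := p.Select()
	fmt.Println(itrs)
}
```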