* Extend storage service protobuf with TagKeys and TagValues
Co-authored-by: Michael Desa <mjdesa@gmail.com>
Co-authored-by: Jacob Marble <jacobmarble@influxdata.com>
* Extend the reads.Store interface with new TagKeys and TagValues APIs
* Extend readservice.store to implement refactored reads.Store interface
* Implement a StringIterator gRPC writer / serializer
* Implement a StringIterator gRPC reader / deserializer
* Implement a StringIterator merger
Storage converts references to _value in filter predicates to $.
It then considers any predicate that does not reference $ to be
a tag predicate. Tag predicates are used to construct series index
cursors.
This commit fixes a bug where field comparisons were being included
in tag predicates due to the HasFieldValueKey function searching
for comparison expressions referencing _value instead of $. Because
all references to _value had already been replaced. An expression
of the form '$ < 3000' would be considered a tag expression and
therefore would be mistakenly included as a tag predicate.
Fixes#13159.
A cursor can be in process (local) or remote. Remote cursors have
already applied the `hasPoints` check, to reduce network traffic.
Testing whether the cursor has points again here:
https://github.com/influxdata/influxdb/blob/master/storage/reads/reader.go#L221
will always return `false` for a remote cursor that has applied the
NoPoints optimization.
This temporary fix allows the `hasPoints` function to differentiate
a streamCursor and always return true in that case.
It is possible a StreamReader (gRPC) may return an empty response. This
change adds retry and bail-out support. When a bail-out occurs,
reads.ErrStreamNoData is returned.
StorageReadClient adapts a gRPC Storage_ReadClient to provide
cursors.CursorStats by reading the trailer after receiving the final
message from the stream.
Because the WAL relies on the tsm1.Value type, we move that into its own
tsm1/value package and set up some aliases forwarding them into tsm1. This
also required adding some methods and changing consumers to avoid the
unexported fields. I imagine this step will be useful one day when we make
the write path more efficient with respect to consuming points.
This commit additionally fixes some issues with generation. The iterator.tmpldata
and generation for array_cursor_* were removed accidentally when removing
iterators, making those generated files stale. Restore that and regenerate.
No change in functionality.
I did this with a dumb editor macro, so some comments changed too.
Also rename root package from platform to influxdb.
In interest of minimizing risk, anyone importing the root package has
now aliased it to "platform" so that no changes beyond imports were
necessary in those files.
Lastly, replace the old platform module to local path /dev/null so that
nobody can accidentally reintroduce a platform dependency while
migrating platform code to influxdb.
A standard Makefile is used now in all subdirs that run go generate.
Make will only generate the file if its source files changed.
The checkgenerate target runs clean to ensure all targets a generated
fresh.
The table interface was modified to expose the arrow buffers. The
storage table has now been converted to use this interface with the same
fixes so that it exposes arrow buffers.
The influxql package has also been updated to use the `DoArrow` method
from the `flux.Table` interface.
Moves the check to determine if a series has data for the current time
range into the groupBySort function, which is consistent with the
groupNoneSort function.
tl;dr
Previously, `Close` was being called concurrently by multiple
goroutines, resulting in a race condition. This commit resolves those
issues.
Background
The `Close` method was performing multiple duties, closing resources
and triggering that the table reading by the `Do` method was done.
Additionally, state to track whether more records existed and if the
table was empty, was ported from the more complicated gRPC
implementation. This logic has been simplified.
This new behavior:
* `table#Do` is responsible for triggering it is done, by closing the
done channel
* The creator of the `table` is responsible for releasing the resources
by calling the `table#Close` method
* The `table#Do` reading can be cancelled by calling the `Cancel`
function, which is safe for concurrent use.
* the Do and Close methods are protected by a mutex to protect storage
resources, such as cursors.
This commit removes the remaining bits of the fields index. In doing
so, the buildCursor method on the engine would need to be updated.
It turns out, that code was statically dead, so delete it and anything
that depended on it. Additionally, delete anything as reported by
the unused tool in the tsdb package.
The generate commands have been modified to take advantage of the new
functionality in Go 1.11 that allows `go run` to execute a package
instead of individual files.
This functionality combined with Go modules allows us to execute a
package directly out of our pinned dependencies rather than accidentally
picking up another binary outside of the build environment.
This also simplifies the Makefile because they no longer have to be
responsible for installing the correct tooling since the Go command
takes care of that logic. It also makes it so that the Makefiles with
file generation can now be invoked from their appropriate subdirectories
so they are contained within the directory itself rather than relying on
values in the top level Makefile.
It is now possible to generate all files within this project by using:
go generate ./...
Or the Makefile can continue to be used.
This commit also copies over the special copy of `tmpl` that the storage
engine uses within the influxdb repository. It was never copied over so
using `go generate` on these packages did not work.
We are moving the necessary code for 2.0 from the influxdb 1.X
repository to the platform 2.0 repository. The logger is an unnecessary
dependency on the old influxdb that is making life more complicated.
This pulls in the code that allows doing reads with flux into the
platform repo, and removes extra.go.
The reusable portion is under storage/reads, where the concrete
implementation for one of the platform's engines is in
storage/readservice.
In order to make this more reusable, the cursors had to move into
their own package, decoupling it from all of the other code in the
tsdb package. tsdb/cursors is this new package, and type/function
aliases have been added to the tsdb package to point at it.
The models package already is very light on transitive dependencies
and so it was allowed to be depended on in a concrete way in the
cursors package.
Finally, the protobuf definitions for issuing GRPC reads has been
moved into its own package for two reasons:
1. It's a clean separation, and helps keep it that way.
2. Many/most consumers will not be using GRPC. We just
use the datatypes to express the API which helps making
a GRPC server easier.
It is left up to future refactorings (specifically ones that involve
GPRC) to determine if these types should remain, or if there is a
cleaner way.
There's still some dependencies on both github.com/influxdata/influxql
and github.com/influxdata/influxdb/logger that we can hopefully remove
in future refactorings.