235 Commits (fdb25560c497ce20ce52423daa5b8c70e4b2a5d2)

Author SHA1 Message Date
Stuart Carnie bf774b66ce
fix(storage): Ensure Tag(Keys|Values) APIs never return (nil, nil)
Formalized this post condition in the documentation and added additional
unit tests.

Added a nil guard and unit test to WriteStringIterator.
2019-05-02 09:45:38 -07:00
Lorenzo Affetti 0993a9f15b fix(readservice): normalize special tag keys after reducing request predicate 2019-05-02 16:55:08 +02:00
Stuart Carnie d858bd6f77
fix(storage): Sort keys were incorrectly sorted when concatenated
This manifested as incorrect sort ordering when serialized via RPC,
resulting in an `invalid partition key order` error.

This fix introduces a delimiter to ensure sort keys cannot collide.
2019-05-01 13:37:28 -07:00
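The collision this fix addresses can be illustrated with a minimal Go sketch. The helper names `joinNaive` and `joinDelimited` are hypothetical, and the NUL byte is an assumed delimiter; the commit message does not say which delimiter was introduced.

```go
package main

import (
	"fmt"
	"strings"
)

// joinNaive concatenates the parts of a composite sort key with no separator.
func joinNaive(parts ...string) string {
	return strings.Join(parts, "")
}

// joinDelimited separates the parts with a NUL byte (an assumed delimiter),
// which sorts before any printable character.
func joinDelimited(parts ...string) string {
	return strings.Join(parts, "\x00")
}

func main() {
	// Two distinct composite keys collapse to the same bytes without a delimiter.
	fmt.Println(joinNaive("a", "bc") == joinNaive("ab", "c"))
	// With a delimiter they stay distinct and order by their first part.
	fmt.Println(joinDelimited("a", "bc") < joinDelimited("ab", "c"))
}
```

A delimiter that sorts below every byte allowed in a key part keeps the byte-wise order of the joined key consistent with the order of the parts compared one by one.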
Jeff Wendling 16e9eb4cb9 tsdb: respond to feedback and improve test coverage
predicate.go:
	UnmarshalPredicate       100.0%
	NewProtobufPredicate     100.0%
	Matches                  100.0%
	Marshal                  100.0%
	walkPredicateNodes       100.0%
	buildPredicateNode       100.0%
	newPredicateState        100.0%
	Reset                    100.0%
	Set                      100.0%
	newPredicateCache        100.0%
	Cached                   100.0%
	Store                    100.0%
	Update                   100.0%
	Update                   92.9%
	Update                   94.1%
	predicateEval            90.9%
	predicatePopTag          100.0%
	predicatePopTagEscape    100.0%
2019-05-01 13:40:40 -06:00
Jeff Wendling 4b4a814d7d storage: fix predicate matching on field tags 2019-05-01 13:40:40 -06:00
Jeff Wendling e84d4625a5 storage: add predicate deletes to the engine interface 2019-05-01 13:40:40 -06:00
Jeff Wendling e10939b8af storage: add predicate tracking to the WAL 2019-05-01 13:40:40 -06:00
Jeff Wendling 7403fd8aa9 tsm1: rename engine method to DeletePrefixRange
The storage/engine knows about buckets, but the tsm1/engine doesn't, so
name the tsm1/engine method Prefix and keep the storage/engine named
Bucket.
2019-05-01 13:40:40 -06:00
jlapacik 5d90683b07 refactor(storage): remove no points tables and streamReader interface
These tables were previously used to perform meta queries.
Meta queries are now answered using a specific API, and as
a result, these tables can go away.
2019-05-01 10:35:10 -07:00
jlapacik 95aa194498 feat(storage): ReadGroup RPC definitions and storage reader 2019-05-01 10:35:10 -07:00
Stuart Carnie f56b4ef020
Merge pull request #13723 from influxdata/sgc/fix/merge
Ensure GroupCursor Keys is union of keys from all GroupCursors of current partition key
2019-05-01 09:07:53 -07:00
Jacob Marble 8c269e0153
chore(log): Put trace_id back in logs (#13712)
* chore(log): Put trace_id back in logs

* fix tests
2019-04-30 18:51:22 -07:00
Stuart Carnie 96c2282aab
fix(query): Keys must be union of all keys from all GroupCursors 2019-04-30 15:49:36 -07:00
Jonathan A. Sternberg e181edd077
fix(storage/reads): translate measurement and field tag key names (#13707)
Translate the measurement and field tag key names to their non-storage
names and add the `_start` and `_stop` tag keys to the output since
they aren't real tags, but ones that are added by range.
2019-04-29 18:11:20 -05:00
Jonathan A. Sternberg 96a76aad1d
fix(storage/reads): reserve data for the tags column when building a table (#13691) 2019-04-29 14:28:25 -05:00
Kelvin Wang ea54e2c2c8 fix(kv): fix empty org name 2019-04-26 18:16:28 -04:00
Stuart Carnie fb39ac39ce
fix(storage): Store.Read behavior changed to return unsorted series keys
Closes #13581
2019-04-26 10:38:59 -07:00
Jonathan A. Sternberg 46d2d0012b fix(storage): translate _measurement and _field to the proper strings (#13662)
The RPC call should translate `_measurement` and `_field` to their
proper shortened byte strings when requesting the tag values.

This also fixes the planner rewrites to return the root node even when
no rewrite happened as this is required by the planner.
2019-04-26 10:36:51 -07:00
Stuart Carnie ed344d25f8
feat(storage): Teach storage how to find a distinct set of tag keys
The TagValues API will perform a linear scan if there is no predicate;
otherwise, it will use the index to find a list of candidate series
keys.

TagKeys expects the predicate to be transformed such that
`_measurement` and `_field` are remapped to `\x00` and `\xff`
respectively.

There is one TODO marked to analyze the predicate for a
`\x00 = '<measurement>'` pattern. If found, the predicate can be
eliminated and fall back to a linear prefix scan by combining the org,
bucket and measurement. This is tracked by issue #13497.
2019-04-24 11:14:22 -07:00
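The remapping that TagKeys expects can be sketched as a small lookup. The function name `storageTagKey` is hypothetical; only the two mappings come from the commit message.

```go
package main

import "fmt"

// storageTagKey maps the user-facing special tag keys to the byte strings
// named in the commit message; all other keys pass through unchanged.
func storageTagKey(name string) string {
	switch name {
	case "_measurement":
		return "\x00"
	case "_field":
		return "\xff"
	default:
		return name
	}
}

func main() {
	for _, k := range []string{"_measurement", "_field", "host"} {
		fmt.Printf("%q -> %q\n", k, storageTagKey(k))
	}
}
```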
Ben Johnson 01bfcf822b
Merge point parse & explode (#12377)
Merge point parse & explode
2019-04-24 10:30:16 -06:00
Ben Johnson 272f340c30
Merge point parse & explode. 2019-04-24 10:12:15 -06:00
Jonathan A. Sternberg 5e77bd1e28
feat(query): implement the read tag values rpc call in the query engine (#13559)
If a pattern is seen that matches the `v1.tagValues(...)` call, then it
will be replaced with a direct RPC call to read the tag values for the
selected tag key which should be better optimized than reading from the
storage engine tsm1 files.
2019-04-23 12:56:35 -05:00
kelwang 0916741838
Merge pull request #13286 from influxdata/bucket_scrapper_org_rename
Bucket scrapper org rename
2019-04-22 19:46:51 -04:00
Jacob Marble bc2650813d
chore(storage): Merge StringIterator cursor stats (#13558)
* chore(storage): Merge StringIterator cursor stats

* add unit test

* properly count stats after iteration
2019-04-22 15:39:21 -07:00
Kelvin Wang 7a72c363f2 remove org from bucket 2019-04-22 18:39:05 -04:00
Jonathan A. Sternberg e5657ca62b
feat(query): implement the read tag keys rpc call in the query engine (#13513)
If a pattern is seen that matches reading the tag keys, it will be
replaced with a direct RPC call to read the tag keys which should be
better optimized than reading from the storage engine tsm1 files.
2019-04-22 14:09:44 -05:00
Jacob Marble 662cf578c9
fix(storage): Assume sorted StringIterators, and retain sorted order (#13491)
* fix(storage): Assume sorted StringIterators, and retain sorted order

* embed struct field

* improve MergedStringIterator efficiency
2019-04-19 01:22:49 -07:00
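Merging inputs that are already sorted, while keeping the output sorted, can be sketched as a two-way merge. This sketch works on plain string slices rather than the project's StringIterator type, whose definition is not shown here; `mergeSorted` is a hypothetical name.

```go
package main

import "fmt"

// mergeSorted merges two ascending slices into one ascending slice,
// emitting each distinct value once.
func mergeSorted(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		var next string
		switch {
		case j >= len(b) || (i < len(a) && a[i] < b[j]):
			next = a[i]
			i++
		case i >= len(a) || b[j] < a[i]:
			next = b[j]
			j++
		default: // both heads are equal: consume both
			next = a[i]
			i++
			j++
		}
		// Skip values already emitted so the result holds no duplicates.
		if len(out) == 0 || out[len(out)-1] != next {
			out = append(out, next)
		}
	}
	return out
}

func main() {
	fmt.Println(mergeSorted([]string{"env", "host"}, []string{"host", "region"}))
}
```

Because both inputs are assumed sorted, the merge needs only one pass and never has to re-sort its output.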
Stuart Carnie 7a3b097197
fix: Add API compatibility assertion 2019-04-18 16:19:19 -07:00
Stuart Carnie 972cda1775
feedback: Changes in response to PR feedback 2019-04-18 16:19:18 -07:00
Stuart Carnie 7fc9661b7b
chore: Move StringIterator to cursors package for wider reuse 2019-04-18 16:19:17 -07:00
Todd Persen cd64ec8718 Fix typos in miscellaneous packages 2019-04-17 13:30:22 -07:00
Jacob Marble ff82e844ae
feat(storage): Add StringIteratorWriter.WrittenN() (#13456) 2019-04-17 11:05:45 -07:00
Jacob Marble 53810fadeb
feat(storage): Implement storage schema RPC de/serializer, merge, APIs (#13409)
* Extend storage service protobuf with TagKeys and TagValues

Co-authored-by: Michael Desa <mjdesa@gmail.com>
Co-authored-by: Jacob Marble <jacobmarble@influxdata.com>

* Extend the reads.Store interface with new TagKeys and TagValues APIs

* Extend readservice.store to implement refactored reads.Store interface

* Implement a StringIterator gRPC writer / serializer

* Implement a StringIterator gRPC reader / deserializer

* Implement a StringIterator merger
2019-04-16 16:01:05 -07:00
Stuart Carnie 5a224c74ea
feat(storage): Stub storage schema APIs
Closes #13241
2019-04-12 12:09:46 -07:00
Jacob Marble f56c42794b
chore(tracing): Cleanup (#13296)
* chore(tracing): Cleanup

* broken test

* fix unused var

* fix test
2019-04-10 19:28:21 -07:00
kelwang be674622c6
Revert "fix(inmem): remove the old inmem implementation" 2019-04-09 14:24:40 -04:00
kelwang d0022dfd5c
Merge pull request #13039 from zhulongcheng/rm-inmem-impl
fix(inmem): remove the old inmem implementation
2019-04-09 13:06:50 -04:00
jlapacik 0cde401678 refactor(storage): update GetSource method of Store interface 2019-04-08 15:59:37 -07:00
jlapacik 8078b915fd refactor(storage): ReadFilter storage operation 2019-04-08 15:59:37 -07:00
zhulongcheng cacd6a8223 fix(inmem): replace inmem.Service with kv.Service 2019-04-08 15:18:38 +08:00
Ben Johnson 307bb6af9c
Improve bulk series file writes. 2019-04-05 14:38:58 -06:00
Jeff Wendling 5d594deb2d
Merge pull request #13094 from influxdata/jmw-protobuf-bump
chore: bump gogo/protobuf and regenerate
2019-04-04 14:35:38 -06:00
jlapacik 2909927ab0
fix(reads): HasFieldValueKey searches for "$", not "_value" (#13138)
Storage converts references to _value in filter predicates to $.
It then considers any predicate that does not reference $ to be
a tag predicate. Tag predicates are used to construct series index
cursors.

This commit fixes a bug where field comparisons were being included
in tag predicates because the HasFieldValueKey function searched
for comparison expressions referencing _value instead of $. Since
all references to _value had already been replaced, an expression
of the form '$ < 3000' would be considered a tag expression and
would therefore be mistakenly included as a tag predicate.

Fixes #13159.
2019-04-04 11:13:09 -07:00
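The check described in this commit can be sketched over a simplified predicate tree. The `node` type below is a hypothetical stand-in for the real predicate representation, and `hasFieldValueKey` only mirrors the name used in the commit message.

```go
package main

import "fmt"

// node is a hypothetical, simplified predicate tree: a comparison node
// carries the key it compares, a logical node carries children.
type node struct {
	key      string  // set on comparison nodes, e.g. "host" or "$"
	children []*node // set on logical (AND/OR) nodes
}

// fieldValueKey is the rewritten form of _value, per the commit message.
const fieldValueKey = "$"

// hasFieldValueKey reports whether any comparison references the field value.
func hasFieldValueKey(n *node) bool {
	if n == nil {
		return false
	}
	if n.key == fieldValueKey {
		return true
	}
	for _, c := range n.children {
		if hasFieldValueKey(c) {
			return true
		}
	}
	return false
}

func main() {
	tagOnly := &node{children: []*node{{key: "host"}, {key: "region"}}}
	withField := &node{children: []*node{{key: "host"}, {key: "$"}}}
	fmt.Println(hasFieldValueKey(tagOnly), hasFieldValueKey(withField))
}
```

Searching for `_value` after the rewrite would never match, which is why every predicate looked like a tag predicate before the fix.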
Jeff Wendling 5dc3e52fd9 chore: bump gogo/protobuf and regenerate
It had been bumped previously from v1.1.1 to v1.2.0 and nothing was
regenerated. This bumps it to v1.2.1 and regenerates.
2019-04-03 15:42:33 -06:00
Jeff Wendling 157c4fcf0c storage: add tests for retention metrics 2019-04-01 17:44:37 -06:00
Jeff Wendling 3e370b0f99 storage: make retention metrics global
In the case that there are multiple engines, we want the retention
metrics from all of them.
2019-04-01 17:44:37 -06:00
Jacob Marble 0b626beb53
chore(storage): Don't log every read request (#12881)
* don't log every read request

* check for err
2019-03-25 13:14:45 -07:00
Stuart Carnie 8abb76cb4e
Merge pull request #12710 from influxdata/sgc/data-gen
Add data generation subcommand to influxd
2019-03-20 11:08:49 -07:00
Lorenzo Affetti 7dbecfa256
fix(storage/reads): release arrow buffers when advancing on table (#12737) 2019-03-20 15:48:59 +01:00
Stuart Carnie fe8b63c10b
chore(gen): Update the gen package to produce 2.0 series keys 2019-03-19 20:35:02 -07:00
Stuart Carnie 07aef958f8
fix(reads): Interface to identify a remote streamed cursor
A cursor can be in process (local) or remote. Remote cursors have
already applied the `hasPoints` check, to reduce network traffic.
Testing whether the cursor has points again here:

https://github.com/influxdata/influxdb/blob/master/storage/reads/reader.go#L221

will always return `false` for a remote cursor that has applied the
NoPoints optimization.

This temporary fix allows the `hasPoints` function to differentiate
a streamCursor and always return true in that case.
2019-03-11 18:40:35 -07:00
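One way to let `hasPoints` tell a remote cursor from a local one is a marker interface, sketched below. All type and method names here are hypothetical; the commit message only says that an interface identifies remote streamed cursors.

```go
package main

import "fmt"

// cursor is a hypothetical stand-in for a storage cursor.
type cursor interface {
	Next() bool
}

// streamCursor is a hypothetical marker interface for remote cursors.
type streamCursor interface {
	cursor
	isStream()
}

// localCursor reads in-process data.
type localCursor struct{ remaining int }

func (c *localCursor) Next() bool {
	if c.remaining == 0 {
		return false
	}
	c.remaining--
	return true
}

// remoteCursor models a cursor whose emptiness check already ran remotely,
// so checking again locally would wrongly report no points.
type remoteCursor struct{}

func (remoteCursor) Next() bool { return false }
func (remoteCursor) isStream()  {}

// hasPoints trusts remote cursors and probes local ones.
func hasPoints(c cursor) bool {
	if _, ok := c.(streamCursor); ok {
		return true // the remote side already filtered out empty series
	}
	return c.Next()
}

func main() {
	fmt.Println(hasPoints(&localCursor{remaining: 0}), hasPoints(remoteCursor{}))
}
```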
Jacob Marble 603a1f26e0 use tracing.StartSpanFromContext 2019-03-07 12:12:31 -07:00
Jacob Marble 9541e861a3 goimports -w -local github.com/influxdata/influxdb 2019-03-07 12:12:31 -07:00
Jacob Marble 92fa813c45 add spans to multiple services 2019-03-07 12:12:31 -07:00
Christopher M. Wolff e28ecdc0e9
refactor(query): make queryd present ProxyQueryService (#12360)
Fixes influxdata/idpe#2014.
2019-03-07 07:32:13 -08:00
Edd Robinson 582ed6834c Address PR feedback 2019-03-07 09:56:07 +00:00
Edd Robinson 098ec71919 Remove erroneous print 2019-03-07 09:56:07 +00:00
Edd Robinson 8bdf857ddb Fix expected flux cases 2019-03-07 09:56:07 +00:00
Edd Robinson 3f1bec0836 Update emitted keys and tests 2019-03-07 09:56:07 +00:00
Edd Robinson f21be142d1 Storage engine now validates all tags are utf-8
The storage engine will now drop any points that contain invalid tag
data. Special tag keys for the measurement and field key will be
excepted from this validation.
2019-03-07 09:56:07 +00:00
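The validation rule can be sketched with the standard `unicode/utf8` package. The `tag` type and `validTags` function are hypothetical, and the special key values `\x00` and `\xff` are an assumption taken from a later entry in this log. Skipping the whole pair for special keys is one reading of the exception described above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// tag is a hypothetical key/value pair.
type tag struct{ key, value string }

// Assumed special keys for the measurement and the field key.
const (
	measurementKey = "\x00"
	fieldKey       = "\xff"
)

// validTags reports whether every ordinary tag pair is valid UTF-8.
// Pairs using a special key are exempt from validation.
func validTags(tags []tag) bool {
	for _, t := range tags {
		if t.key == measurementKey || t.key == fieldKey {
			continue
		}
		if !utf8.ValidString(t.key) || !utf8.ValidString(t.value) {
			return false
		}
	}
	return true
}

func main() {
	ok := []tag{{measurementKey, "cpu"}, {"host", "a"}, {fieldKey, "usage"}}
	bad := []tag{{"host", "bad\xc3\x28"}}
	fmt.Println(validTags(ok), validTags(bad))
}
```

Note that `\xff` on its own is not valid UTF-8, which is one reason the special keys need an exemption at all.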
Edd Robinson 9949fef2a5 Only validate tag pairs are valid unicode
We will want to validate that all tag key/value data is valid unicode.
This commit changes the validation helper to only validate provided
tags, since measurements are currently very likely to contain invalid
utf-8 characters.

There are two exceptions to the tag validation: the validation of the
special tag keys for measurements and field keys.
2019-03-07 09:56:07 +00:00
Edd Robinson f029f1645d Change location and value for internal tag keys 2019-03-07 09:56:07 +00:00
Jonathan A. Sternberg 9549cc4d97
fix(storage/reads): track memory allocations when reading from storage (#12404) 2019-03-06 15:29:45 -06:00
Jeff Wendling f53f9cd949 storage: detect conflicting types in a single batch of points
When the WAL was moved up, the validation that happened at the cache
was skipped. This moves the field type validation for a batch of
points up ahead of the WAL again.
2019-03-06 10:30:52 -07:00
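Detecting conflicting field types inside one batch can be sketched with a map from field key to the first type seen. The `point` and `fieldType` types and the `firstConflict` function are hypothetical simplifications, not the engine's actual types.

```go
package main

import "fmt"

// fieldType is a hypothetical enumeration of value types.
type fieldType int

const (
	floatType fieldType = iota
	intType
	stringType
)

// point is a hypothetical, reduced point: one series+field key and its type.
type point struct {
	seriesField string
	typ         fieldType
}

// firstConflict returns the first key that appears with two different
// types within the same batch.
func firstConflict(batch []point) (string, bool) {
	seen := make(map[string]fieldType, len(batch))
	for _, p := range batch {
		if prev, ok := seen[p.seriesField]; ok && prev != p.typ {
			return p.seriesField, true
		}
		seen[p.seriesField] = p.typ
	}
	return "", false
}

func main() {
	batch := []point{
		{"cpu,host=a#usage", floatType},
		{"cpu,host=a#usage", stringType},
	}
	fmt.Println(firstConflict(batch))
}
```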
Stuart Carnie da24fe6c7f feedback(reads): Clear r.buf before read 2019-03-05 11:45:51 -07:00
Stuart Carnie acc94985e3 fix(reads): Add retry when fetching data from StreamReader
It is possible a StreamReader (gRPC) may return an empty response. This
change adds retry and bail-out support. When a bail-out occurs,
reads.ErrStreamNoData is returned.
2019-03-05 11:45:51 -07:00
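The retry and bail-out behaviour can be sketched as a bounded loop. The `recvFunc` type, the `recvWithRetry` function, and the attempt limit are hypothetical; `errStreamNoData` is a local stand-in for the role `reads.ErrStreamNoData` plays in the commit.

```go
package main

import (
	"errors"
	"fmt"
)

// errStreamNoData is a local stand-in for reads.ErrStreamNoData.
var errStreamNoData = errors.New("stream returned no data")

// recvFunc is a hypothetical receive function returning a frame count.
type recvFunc func() (frames int, err error)

// recvWithRetry retries empty responses up to maxAttempts times and
// bails out with errStreamNoData if every attempt was empty.
func recvWithRetry(recv recvFunc, maxAttempts int) (int, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		n, err := recv()
		if err != nil {
			return 0, err
		}
		if n > 0 {
			return n, nil
		}
	}
	return 0, errStreamNoData
}

func main() {
	calls := 0
	flaky := func() (int, error) {
		calls++
		if calls < 3 {
			return 0, nil // empty response
		}
		return 5, nil
	}
	fmt.Println(recvWithRetry(flaky, 5))
}
```

Bounding the number of attempts keeps a stream that never produces data from stalling the reader indefinitely.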
Stuart Carnie 28976e9b92 feat: Allow StreamReader to provide cursors.CursorStats
StorageReadClient adapts a gRPC Storage_ReadClient to provide
cursors.CursorStats by reading the trailer after receiving the final
message from the stream.
2019-03-05 11:45:51 -07:00
Stuart Carnie 5f241e5e5e fix(reads): Read Statistics before calling Close 2019-03-05 11:45:51 -07:00
Jacob Marble b9c7ec439e
feat(influxd): Tracing refactor (#12318)
* feat(launcher): Tracing to log disabled by default

* remove traceLogger and use opentracing directly

* add Jaeger tracing

* go vet && go fmt
2019-03-04 11:48:11 -08:00
Jeff Wendling 0fae44e219 storage: fix problems with keeping resources alive
This commit adds the pkg/lifecycle.Resource to help manage opening,
closing, and leasing out references to some resource. A resource
cannot be closed until all acquired references have been released.
If the debug_ref tag is enabled, all resource acquisitions keep
track of the stack trace that created them and have a finalizer
associated with them to print on stderr if they are leaked. It also
registers a handler on SIGUSR2 to dump all of the currently live
resources.

Having resources tracked in a uniform way with a data type allows us
to do more sophisticated tracking with the debug_ref tag, as well.
For example, we could panic the process if a resource cannot be
closed within a certain time frame, or attempt to figure out the
DAG of resource ownership dynamically.

This commit also fixes many issues around resources, correctness
during error scenarios, reporting of errors, idempotency of
close, tracking of memory for some data structures, resource leaks
in tests, and out of order dependency closes in tests.
2019-02-28 10:22:01 -07:00
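The core rule, that a resource cannot be closed until all acquired references have been released, can be sketched with a mutex and a wait group. This is not the pkg/lifecycle implementation; the `resource` and `reference` types are hypothetical and omit the debug_ref tracking described above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// resource is a hypothetical reference-counted resource.
type resource struct {
	mu     sync.Mutex
	closed bool
	refs   sync.WaitGroup
}

// reference is a lease on a resource; Release is safe to call repeatedly.
type reference struct {
	once sync.Once
	res  *resource
}

// Acquire leases a reference, failing once Close has started.
func (r *resource) Acquire() (*reference, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.closed {
		return nil, errors.New("resource closed")
	}
	r.refs.Add(1)
	return &reference{res: r}, nil
}

// Release gives the lease back exactly once.
func (ref *reference) Release() { ref.once.Do(ref.res.refs.Done) }

// Close rejects new leases, then blocks until all leases are released.
func (r *resource) Close() {
	r.mu.Lock()
	r.closed = true
	r.mu.Unlock()
	r.refs.Wait()
}

func main() {
	res := &resource{}
	ref, err := res.Acquire()
	if err != nil {
		fmt.Println(err)
		return
	}
	done := make(chan struct{})
	go func() { res.Close(); close(done) }()
	ref.Release() // Close can only finish after this
	<-done
	_, err = res.Acquire()
	fmt.Println("acquire after close:", err)
}
```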
Jonathan A. Sternberg 70507670c3
feat(storage/reads): add scanned values and bytes metadata to the query (#12156)
This updates influxdb to use the new arbitrary metadata that can be
attached by a source and adds metadata entries for the cursor
statistics.
2019-02-25 14:44:18 -06:00
Michael Desa 0d3d0d4d78
chore(influxdb): add context to storage.PointsWriter 2019-02-25 11:11:20 -05:00
Stuart Carnie 662887c679 feedback: Simplify reader; use constants from influxdb package 2019-02-21 11:18:08 -08:00
Stuart Carnie 01b5fccfbe feat(storage): Enforce single org for series key reads 2019-02-21 11:18:08 -08:00
Alirie Gray 5f524eb92d Rename all occurrences of Macro to Variable 2019-02-14 13:21:57 -08:00
Jeff Wendling 3bb765279b storage: respond to review comments 2019-02-04 12:26:26 -07:00
Jeff Wendling 3014733b20 chore: fix staticcheck issues 2019-02-04 10:32:52 -07:00
Jeff Wendling 376b347d56 wal: change deletes to be based on DeleteBucket 2019-02-04 10:32:52 -07:00
Jeff Wendling 7f54e816e3 refactor: have retention use DeleteBucketRange 2019-02-04 10:32:52 -07:00
Jeff Wendling aa12144fc7 storage: replay the WAL through the whole engine 2019-02-04 10:32:52 -07:00
Jeff Wendling 6deced1215 refactor: make the WAL part of snapshots again 2019-02-04 10:32:52 -07:00
Jeff Wendling 2989936d5a refactor: write to the WAL again 2019-02-04 10:32:52 -07:00
Jeff Wendling 2f46937527 refactor: move value package up to tsdb 2019-02-04 10:32:52 -07:00
Jeff Wendling 8dbd98ccbe refactor: change the way the engine opens and closes and reload the cache
Open and Close now proceed as best as they are able to in the presence
of errors and clean up appropriately.
2019-02-04 10:32:52 -07:00
Jeff Wendling d2ddd48eea refactor: hook up metrics and wal to storage engine
It turns out that LastModified and DiskSize are unused, and so it
was easy to change to not care about the WAL.

This hooks up metrics and starts the WAL again.
2019-02-04 10:32:52 -07:00
Jeff Wendling 95de3d52b2 refactor: use concrete WAL in tsm1
At the cost of some nil checks, we don't have to have an interface, defend against
subtle bugs with nils in non-nil interfaces, an empty implementation, etc.

Also, the tsm1 engine is losing the WAL anyway.
2019-02-04 10:32:52 -07:00
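The "nils in non-nil interfaces" problem mentioned above is a general Go behaviour and can be shown in a few lines. The `walLike` interface and `fileWAL` type are hypothetical names chosen for this sketch.

```go
package main

import "fmt"

// walLike is a hypothetical interface a WAL might satisfy.
type walLike interface {
	Sync() error
}

// fileWAL is a hypothetical concrete WAL.
type fileWAL struct{}

func (w *fileWAL) Sync() error { return nil }

// isNilInterface reports whether the interface value itself is nil.
func isNilInterface(w walLike) bool { return w == nil }

func main() {
	var concrete *fileWAL // a nil pointer
	var iface walLike = concrete

	// The pointer is nil, but the interface holding it is not, because
	// the interface still carries the *fileWAL type information.
	fmt.Println(concrete == nil, iface == nil) // prints "true false"
}
```

Holding a concrete pointer avoids this trap: a plain `w == nil` check then behaves as expected.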
Jeff Wendling c9bb55b889 refactor: move the tsm1/wal into the storage/wal package
Because the WAL relies on the tsm1.Value type, we move that into its own
tsm1/value package and set up some aliases forwarding them into tsm1. This
also required adding some methods and changing consumers to avoid the
unexported fields. I imagine this step will be useful one day when we make
the write path more efficient with respect to consuming points.

This commit additionally fixes some issues with generation. The iterator.tmpldata
and generation for array_cursor_* were removed accidentally when removing
iterators, making those generated files stale. Restore that and regenerate.

No change in functionality.
2019-02-04 10:32:52 -07:00
Nathaniel Cook 8372859dee Merge branch 'master' into flux-staging 2019-01-15 08:35:59 -07:00
Nathaniel Cook d0603457b7 refactor(flux): make packages mirror Flux namespaces 2019-01-14 18:00:45 -07:00
Edd Robinson b025d9afa9 Improve efficiency of TSI index series drop
This commit improves the performance of a mass delete on the TSI index
by deleting at the measurement level instead of deleting each series
individually.
2019-01-14 12:46:55 +00:00
Edd Robinson 9e11602b6a Add DeleteBucket benchmark 2019-01-14 12:46:55 +00:00
Nathaniel Cook 1708a41fa7 refactor: update query functions for Flux builtins 2019-01-11 13:11:57 -07:00
Jonathan A. Sternberg a59e6b8d25 refactor: rename DoArrow to Do (#2372)
See influxdata/flux#783 for details.
2019-01-10 10:30:25 -07:00
Mark Rushakoff d73d73c0d4 chore: rename imports from platform to influxdb
I did this with a dumb editor macro, so some comments changed too.

Also rename root package from platform to influxdb.

In interest of minimizing risk, anyone importing the root package has
now aliased it to "platform" so that no changes beyond imports were
necessary in those files.

Lastly, replace the old platform module to local path /dev/null so that
nobody can accidentally reintroduce a platform dependency while
migrating platform code to influxdb.
2019-01-09 20:51:47 -08:00
Jeff Wendling 703c3c15ca Hook up DeleteBucket to the tsm1 engine 2019-01-09 15:24:26 -07:00
Edd Robinson 42ff769f1c Wire up storage.Engine to HTTP BucketService 2019-01-09 15:09:56 +00:00
Edd Robinson e3ae256782 Add storage bucket service
The storage bucket service wraps another bucket service, and invokes
actions on a storage engine based upon the actions taken upon buckets.

Currently, the storage bucket service will delete bucket data from the
storage engine when the bucket is deleted via the bucket service.
2019-01-09 13:35:25 +00:00
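The wrapping described above follows the decorator pattern and can be sketched as follows. Every name below is hypothetical, the interfaces are reduced to a single method each, and the order of the two deletions is an assumption; the commit message does not specify it.

```go
package main

import "fmt"

// bucketService is a hypothetical, reduced view of a bucket service.
type bucketService interface {
	DeleteBucket(id string) error
}

// bucketDataDeleter is a hypothetical view of the storage engine.
type bucketDataDeleter interface {
	DeleteBucketData(id string) error
}

// storageBucketService wraps another bucket service and forwards
// deletions to the storage engine as well.
type storageBucketService struct {
	inner  bucketService
	engine bucketDataDeleter
}

// DeleteBucket deletes the bucket via the wrapped service, then removes
// its data from the engine (assumed order).
func (s *storageBucketService) DeleteBucket(id string) error {
	if err := s.inner.DeleteBucket(id); err != nil {
		return err
	}
	return s.engine.DeleteBucketData(id)
}

// recorder is a test double that satisfies both interfaces.
type recorder struct{ calls []string }

func (r *recorder) DeleteBucket(id string) error {
	r.calls = append(r.calls, "meta:"+id)
	return nil
}

func (r *recorder) DeleteBucketData(id string) error {
	r.calls = append(r.calls, "data:"+id)
	return nil
}

func main() {
	rec := &recorder{}
	svc := &storageBucketService{inner: rec, engine: rec}
	if err := svc.DeleteBucket("b1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rec.calls)
}
```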
Edd Robinson 5a12a3a72e Export access to bucket drop in engine 2019-01-09 13:35:07 +00:00
Nathaniel Cook 89f4525841 build(Makefile): fix various bugs with makefiles
Fixes subdir ordering.
Works around issue with stringer not working with Go modules.
Fixes issues with generated code being ignored.

Fixes #2044
2018-12-19 17:02:19 -07:00
Nathaniel Cook d6c0a393b0 Merge branch 'master' into flux-staging 2018-12-19 11:30:55 -07:00