284 Commits (d720661e77564105bfe371a392deff165ce49562)

Author SHA1 Message Date
Chris Goller 7de2cafb13 feat(storage/readservice): define engine interface
We added an interface for the *storage.Engine to make it easier
to add end-to-end tests.

Co-authored-by: Bucky Schwarz <d.w.schwarz@gmail.com>
2019-11-20 15:54:32 -06:00
Edd Robinson 8f6701d4b1 feat(storage): add full compaction semaphore
By default this feature is disabled; the full compaction behaviour does
not change. When this feature is enabled compactions can be limited
across multiple storage engines running in multiple processes.

The mechanism by which this happens is not part of the abstraction added
here.
2019-10-23 19:45:01 +01:00
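The commit message above does not show the abstraction itself. As a rough in-process sketch of how a compaction limiter of this kind might look, assume a hypothetical `Semaphore` interface with `TryAcquire` and `Release`. These names and the channel-backed implementation are illustrative assumptions, not taken from the commit, and a limiter that spans processes would need a different mechanism.

```go
package main

import "fmt"

// Semaphore is a hypothetical abstraction for limiting full compactions.
// The names here are illustrative assumptions, not the commit's API.
type Semaphore interface {
	TryAcquire() bool // reports whether a slot was obtained without blocking
	Release()         // returns a previously acquired slot
}

// chanSemaphore is an in-process implementation backed by a buffered channel.
type chanSemaphore chan struct{}

func newChanSemaphore(n int) chanSemaphore { return make(chanSemaphore, n) }

func (s chanSemaphore) TryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

func (s chanSemaphore) Release() {
	select {
	case <-s:
	default: // nothing is held; ignore
	}
}

func main() {
	var sem Semaphore = newChanSemaphore(1)
	fmt.Println(sem.TryAcquire()) // first engine gets the slot
	fmt.Println(sem.TryAcquire()) // second engine must skip or retry later
	sem.Release()
	fmt.Println(sem.TryAcquire()) // the slot is free again
}
```

A non-blocking `TryAcquire` lets an engine skip a full compaction when the limit is reached instead of stalling its compaction loop.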
Brandon Farmer ea82dc3470 fix(tasks): tasks look up system bucket id 2019-10-21 14:48:47 -07:00
Brandon Farmer 2e0749b3ba feat(influxdb): Add system buckets on org creation
* Only allow users to create user buckets
* Only accept bucket creation parameters on post
2019-10-21 14:48:47 -07:00
Kelvin Wang 62f4042853 feat(influxdb): add predicate package 2019-10-18 12:02:52 -04:00
Edd Robinson 179c57ab2e feat(storage): allow compaction limiter to be injected 2019-10-04 12:35:21 -07:00
elbehery c0b87c657c fix(storage): remove level=0 from TSM disk bytes metrics. 2019-09-25 15:57:25 +02:00
Brandon Farmer d83fabeabc feat(influxdb): user disabling 2019-09-23 11:57:16 -07:00
Edd Robinson db72f57da4 feat(storage): inject function to control when retention enforcer runs ()
* test(storage): ensure multiple engines can run concurrently

* feat(storage): expose control over retention run

Fixes .

This commit adds the ability to inject a functional option into a
storage.Engine for controlling when the retention enforcer can run.

Previously, retention enforcers ran on an interval; if you ran multiple
storage engines (as we do in some environments) then it was not possible
to coordinate when engines ran retention. Often they would synchronise
because they started at the same time.

This change will let you specify a blocking function to control when the
retention enforcer can run.

A simple function for serialising retention enforcement across multiple
storage engines could look like:

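```go
import "sync"

// mu serialises retention enforcement across all engines that share f.
var mu sync.Mutex

// f blocks until the lock is held and returns a function that releases it.
func f() (done func()) {
    mu.Lock()
    return func() { mu.Unlock() }
}
```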
2019-09-23 08:09:04 -07:00
Lorenzo Affetti 053836e5a5
Merge pull request from influxdata/flux-staging-v0.48.x
build(flux): update to Flux v0.48.0
2019-09-20 18:24:02 +02:00
Lorenzo Affetti ab835c8e0e
refactor(dependencies): use new dependency injection framework ()
refactor(dependencies): use new dependency injection framework
2019-09-19 17:01:17 +02:00
Edd Robinson e2f5b2bd9d refactor(storage): add more context to traces and logs 2019-09-19 13:48:06 +01:00
Ben Johnson 9237ee6a40
fix(tsi1): Remove TSI cardinality stats cache 2019-09-04 14:48:22 -06:00
George 8109d161bb
perf(storage): expose ability to peek on stream readers () 2019-09-04 13:57:36 +00:00
Nathaniel Cook dfc28335ea refactor(query/dependencies): update to new Flux dependencies defaults 2019-08-26 16:46:17 -06:00
Adam 945b68b8fd fix(query): finish refactoring the repl and inject the secret service as a dependency 2019-08-26 16:46:17 -06:00
Nathaniel Cook 6303e2dcc5 test(query): skip holt_winters_panic test
added executor dependencies where needed
2019-08-26 16:46:17 -06:00
Adam Perlin 76dbc44e3c
feat(storage): Add influxd inspect dumpwal tool ()
* feat(storage/wal/dump): initial influxd inspect dumptsmwal implementation

* feat(storage/wal/dump): add org bucket formatting to dumpwal tool; improve test cases

* refactor(storage/wal/dump): add long description for dumptsmwal tool

* refactor(storage/wal/dump): rename dumptsmwal flag

* chore(storage/wal/dump): gofmt

* refactor(storage/wal/dump): update error printing in dumptsmwal tool

* refactor(storage/wal/dump): address review comments

* refactor(storage/wal/dump): rename dumpwal command source file

* refactor(storage/wal/dump): clarify print flag comment

* refactor(inspect): remove unnecessary for-loop in influxd inspect command
2019-08-23 13:05:06 -07:00
Jacob Marble 851279b71f
chore(storage): bring back storage_retention_checks_total () 2019-08-22 10:47:27 -07:00
Edd Robinson d160585a34 refactor(storage): add deeper tracing around deletes 2019-08-22 11:08:33 +01:00
Jacob Marble 26d29f7aa5
chore(storage): remove metric storage_retention_checks_total () 2019-08-20 14:39:08 -07:00
Stuart Carnie f60c2ec3ba
fix(reads): Remove issue reference from test per feedback 2019-08-16 13:00:06 -07:00
Stuart Carnie 3ca751cfd6
fix(reads): ResponseWriter truncates values for last series
The ResponseWriter would truncate the last series if the byte size of
the points frames exceeded the writeSize constant, causing a Flush to
occur and the cumulative ResponseWriter.sz to reset to zero. Because
ResponseWriter.sz was not incremented for each frame, it remained at
zero, which resulted in the final Flush short circuiting.

This commit implements the Size method for the cursors.Array types,
which is used to estimate the size of a frame. This replaces calling
the Protocol Buffer `Size` function, which can be very expensive.
2019-08-16 10:36:40 -07:00
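The failure mode described in the commit above can be illustrated with a simplified writer. This is a sketch under stated assumptions: `frame`, `writer`, and `countSent` are hypothetical stand-ins rather than the real `ResponseWriter`, and the 8-bytes-per-value estimate is illustrative only.

```go
package main

import "fmt"

// frame stands in for a points frame; Size is a cheap estimate
// (8 bytes per float64) rather than an exact encoded size.
type frame struct{ values []float64 }

func (f frame) Size() int { return 8 * len(f.values) }

// writer is a simplified, hypothetical stand-in for the ResponseWriter.
type writer struct {
	writeSize int       // flush threshold in bytes
	sz        int       // estimated bytes buffered since the last flush
	buf       []frame   // frames waiting to be sent
	sent      [][]frame // batches that have been flushed
}

func (w *writer) Write(f frame) {
	w.buf = append(w.buf, f)
	w.sz += f.Size() // account for every frame so the final Flush sees it
	if w.sz >= w.writeSize {
		w.Flush()
	}
}

// Flush short-circuits when sz is zero, so sz must reflect buffered frames.
func (w *writer) Flush() {
	if w.sz == 0 {
		return
	}
	w.sent = append(w.sent, w.buf)
	w.buf, w.sz = nil, 0
}

// countSent returns the total number of frames across all flushed batches.
func countSent(w *writer) int {
	n := 0
	for _, batch := range w.sent {
		n += len(batch)
	}
	return n
}

func main() {
	w := &writer{writeSize: 16}
	for i := 0; i < 3; i++ {
		w.Write(frame{values: []float64{float64(i)}})
	}
	w.Flush() // the final flush must deliver the trailing frame
	fmt.Println(countSent(w))
}
```

If the `w.sz += f.Size()` line were omitted, `sz` would stay at zero after the mid-stream flush and the last frame would never be sent.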
Stuart Carnie 0b20c227b4
feat(reads): A series of helpers to produce a SeriesCursor
This allows the data/gen package to be used to produce a SeriesCursor
for generated data that can be used in testing by the reads package.
2019-08-16 10:36:30 -07:00
Jonathan A. Sternberg 3d747b4fb1
fix(storage/reads): remove duplicate tables from the stream ()
If the reader produces more than one table with the same group key, we
discard the later ones because the stream should never give us more than
one table with the same group key.

This is an error and it indicates the server sent us a bad set of data.
This change makes it so that the client is tolerant of that data and
will discard it if it exists.
2019-08-15 10:20:35 -05:00
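The tolerant behaviour described in the commit above amounts to keeping the first table per group key and discarding the rest. A minimal sketch, assuming a hypothetical `table` type whose group key is a plain string (the real stream reader works on Flux tables):

```go
package main

import "fmt"

// table is a minimal stand-in; groupKey is assumed to be a string here.
type table struct {
	groupKey string
	rows     int
}

// dropDuplicates keeps the first table seen for each group key and
// discards any later table with the same key.
func dropDuplicates(in []table) []table {
	seen := make(map[string]struct{}, len(in))
	out := make([]table, 0, len(in))
	for _, t := range in {
		if _, ok := seen[t.groupKey]; ok {
			continue
		}
		seen[t.groupKey] = struct{}{}
		out = append(out, t)
	}
	return out
}

func main() {
	tables := []table{{"host=a", 3}, {"host=b", 2}, {"host=a", 5}}
	for _, t := range dropDuplicates(tables) {
		fmt.Println(t.groupKey, t.rows)
	}
}
```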
Edd Robinson 5aead27e8b refactor(storage): remove commented code 2019-08-12 13:49:26 +01:00
j. Emrys Landivar (docmerlin) 7bd481d829 respond to pr comments 2019-08-05 13:16:51 -05:00
j. Emrys Landivar (docmerlin) 24c1f21e4e WIP 2019-08-05 13:16:51 -05:00
Christopher M. Wolff 42bb664aaf
feat(query): add storage request duration metric ()
2019-08-02 08:53:14 -07:00
tmgordeeva 48ee7ada04
fix(storage): move retention snapshot out of per bucket calls ()
* fix(storage): move retention snapshot out of per bucket calls

Also adds tracking for snapshots from retention and full compactions.
2019-07-23 11:40:05 -07:00
tmgordeeva 871f5466fe
fix(storage): run snapshot before retention deletes ()
Deleting from the cache takes a lock which blocks writes. Snapshot to clear the
cache before deleting to reduce the lock contention.
2019-07-22 16:22:42 -07:00
Edd Robinson abbe795fa5 docs(storage): update PB doc to reflect new domain 2019-07-05 17:10:56 +01:00
Jonathan A. Sternberg 8cf3453d5c
fix(storage/reads): storage table implementation passes table tests () 2019-07-03 09:26:08 -05:00
Adam Perlin 24baec9e6d Gofmt verify-wal files 2019-06-27 16:28:28 -07:00
Adam Perlin fba4326c72 feat(storage): remove unnecessary lines from verify-wal test 2019-06-27 16:28:28 -07:00
Adam Perlin f4faa9b2f5 feat(storage): Small verify-wal output and test tweaks 2019-06-27 16:28:28 -07:00
Adam Perlin c868ece4f6 feat(storage): Initial 2.x verify-wal tool functionality 2019-06-27 16:28:28 -07:00
Tanya Gordeeva 6428cdbce6 fix(storage): initialize tsm file metrics, update after compaction
These metrics weren't being properly initialized on opening the file store, and
weren't being properly updated on compaction.
2019-06-20 14:37:53 -07:00
Ben Johnson 14980d55b8
fix(storage): Add WithCurrentGenerationFunc() for generation injection.
Adds the ability to set the current generation to use when compacting
the cache only. Previously, we used the current generation for all
files but this causes issues and we should only use the current
generation for level 1 compaction.
2019-06-20 08:54:38 -06:00
Jonathan A. Sternberg eeb32beb49
fix(storage/reads): ensure that the column reader gets its length set ()
When a buffered column reader was used, the length was not reset to
whatever the requested length was for the buffer so it was possible for
the length to be longer than the actual columns.
2019-06-05 15:09:37 -05:00
Jonathan A. Sternberg 2b1c1ec143
fix(storage/reads): fix the storage tables to work correctly with multiple transformations ()
The storage table reader will now work correctly when there are multiple
outputs. The table interface now implements the new table and column
reader interfaces and works properly with `execute.CopyTable`. The
source uses `execute.CopyTable` to buffer the table in memory when there
are multiple output transformations.
2019-05-30 12:31:54 -05:00
Mark Rushakoff 4b3d57c06d fix(storage): add missing RUnlock in Engine.Close
I don't see anywhere obvious that an engine would be closed twice, but
if it was, the RLock would have been held permanently, such that a Lock
could not be taken later.

Running go test ./storage/... did not trigger a double-close.
2019-05-29 08:40:40 -07:00
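The locking pattern behind the fix above can be sketched as follows. The `engine` type and its `closing` field are hypothetical simplifications, not the real `storage.Engine`; the point is only that every early return taken under `RLock` needs a matching `RUnlock`.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical, stripped-down engine used only to show the
// locking pattern.
type engine struct {
	mu      sync.RWMutex
	closing chan struct{} // nil once the engine is closed
}

func newEngine() *engine { return &engine{closing: make(chan struct{})} }

func (e *engine) Close() error {
	e.mu.RLock()
	if e.closing == nil {
		e.mu.RUnlock() // release the read lock on the early-return path too
		return nil
	}
	e.mu.RUnlock()

	e.mu.Lock()
	defer e.mu.Unlock()
	if e.closing == nil { // re-check: another Close may have run in between
		return nil
	}
	close(e.closing)
	e.closing = nil
	return nil
}

func main() {
	e := newEngine()
	fmt.Println(e.Close(), e.Close())
	e.mu.Lock() // would block forever if the second Close leaked its read lock
	e.mu.Unlock()
	fmt.Println("write lock acquired after double close")
}
```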
Jonathan A. Sternberg 21c80f3e93
refactor(query/control): move the controller from flux to influxdb ()
The controller implementation is primarily used by influxdb so it
shouldn't be part of the flux repository. This copies the code from flux
to influxdb so it can be removed from the next flux release.
2019-05-29 09:04:34 -05:00
Jonathan A. Sternberg ebdbc394fc
chore(flux): update to Flux v0.31.0 ()
* refactor(storage/reads): update the table implementation for the interface change ()

* chore(flux): update to Flux v0.31.0
2019-05-28 17:24:26 -05:00
Jonathan A. Sternberg c98a40db14
fix(storage/reads): stop copying the values to an unnecessary buffer in the storage reader ()
The copy was unnecessary since it was just going to be copied
immediately afterwards into an Arrow buffer. In the future, we will want
to have storage directly send the arrow buffer, but right now we are
putting it in an array and copying it anyway.

Even when we send an arrow buffer, the underlying sequence of bytes is
probably going to be different and we will rely on the allocator to
reuse bytes so let's remove the extra copy.
2019-05-15 20:40:29 -05:00
Christopher Wolff 90a5d88fc5 fix(query): skip failing end to end tests 2019-05-14 12:52:37 -07:00
jlapacik faab75968b refactor(storage): remove Read method from Store interface 2019-05-03 11:02:20 -07:00
Jeff Wendling ef0768db31
tsm1: predicate deletes ()
tsm1: predicate deletes
2019-05-03 14:27:25 +00:00
Lorenzo Affetti 26d327ef9d
Merge pull request from influxdata/fix/read-filter
fix(readservice): normalize special tag keys after reducing request p…
2019-05-02 20:13:32 +02:00
Stuart Carnie bf774b66ce
fix(storage): Ensure Tag(Keys|Values) APIs never return (nil, nil)
Formalized this post condition in the documentation and added additional
unit tests.

Added a nil guard and unit test to WriteStringIterator.
2019-05-02 09:45:38 -07:00
Lorenzo Affetti 0993a9f15b fix(readservice): normalize special tag keys after reducing request predicate 2019-05-02 16:55:08 +02:00
Stuart Carnie d858bd6f77
fix(storage): Sort keys were incorrectly sorted when concatenated
This manifested as incorrect sort ordering when serialized via RPC,
resulting in an `invalid partition key order` error.

This fix introduces a delimiter to ensure sort keys cannot collide.
2019-05-01 13:37:28 -07:00
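To see why plain concatenation breaks ordering and why a delimiter helps, consider the sketch below. The helper names are hypothetical, and the zero byte is an illustrative delimiter chosen on the assumption that it never occurs inside a key part; the commit does not say which delimiter it uses.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveKey concatenates the parts directly; different tuples can collide
// or sort in the wrong order.
func naiveKey(parts ...string) string { return strings.Join(parts, "") }

// delimitedKey joins parts with a separator that is assumed never to occur
// inside a part and that sorts below every byte that does.
func delimitedKey(parts ...string) string { return strings.Join(parts, "\x00") }

func main() {
	// As tuples, ("a", "c") sorts before ("ab", "a") because "a" < "ab".
	fmt.Println(naiveKey("a", "c") < naiveKey("ab", "a"))         // false: "ac" > "aba"
	fmt.Println(delimitedKey("a", "c") < delimitedKey("ab", "a")) // true

	// Two different tuples produce the same naive key.
	fmt.Println(naiveKey("a", "bc") == naiveKey("ab", "c"))         // true
	fmt.Println(delimitedKey("a", "bc") == delimitedKey("ab", "c")) // false
}
```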
Jeff Wendling 16e9eb4cb9 tsdb: respond to feedback and improve test coverage
predicate.go:
	UnmarshalPredicate       100.0%
	NewProtobufPredicate     100.0%
	Matches                  100.0%
	Marshal                  100.0%
	walkPredicateNodes       100.0%
	buildPredicateNode       100.0%
	newPredicateState        100.0%
	Reset                    100.0%
	Set                      100.0%
	newPredicateCache        100.0%
	Cached                   100.0%
	Store                    100.0%
	Update                   100.0%
	Update                   92.9%
	Update                   94.1%
	predicateEval            90.9%
	predicatePopTag          100.0%
	predicatePopTagEscape    100.0%
2019-05-01 13:40:40 -06:00
Jeff Wendling 4b4a814d7d storage: fix predicate matching on field tags 2019-05-01 13:40:40 -06:00
Jeff Wendling e84d4625a5 storage: add predicate deletes to the engine interface 2019-05-01 13:40:40 -06:00
Jeff Wendling e10939b8af storage: add predicate tracking to the WAL 2019-05-01 13:40:40 -06:00
Jeff Wendling 7403fd8aa9 tsm1: rename engine method to DeletePrefixRange
The storage/engine knows about buckets, but the tsm1/engine doesn't, so
name the tsm1/engine method Prefix and keep the storage/engine named
Bucket.
2019-05-01 13:40:40 -06:00
jlapacik 5d90683b07 refactor(storage): remove no points tables and streamReader interface
These tables were previously used to perform meta queries.
Meta queries are now answered using a specific API, and as
a result, these tables can go away.
2019-05-01 10:35:10 -07:00
jlapacik 95aa194498 feat(storage): ReadGroup RPC definitions and storage reader 2019-05-01 10:35:10 -07:00
Stuart Carnie f56b4ef020
Merge pull request from influxdata/sgc/fix/merge
Ensure GroupCursor Keys is union of keys from all GroupCursors of current partition key
2019-05-01 09:07:53 -07:00
Jacob Marble 8c269e0153
chore(log): Put trace_id back in logs ()
* chore(log): Put trace_id back in logs

* fix tests
2019-04-30 18:51:22 -07:00
Stuart Carnie 96c2282aab
fix(query): Keys must be union of all keys from all GroupCursors 2019-04-30 15:49:36 -07:00
Jonathan A. Sternberg e181edd077
fix(storage/reads): translate measurement and field tag key names ()
Translate the measurement and field tag key names to their non-storage
names and add the `_start` and `_stop` tag keys to the output since
they aren't real tags, but ones that are added by range.
2019-04-29 18:11:20 -05:00
Jonathan A. Sternberg 96a76aad1d
fix(storage/reads): reserve data for the tags column when building a table () 2019-04-29 14:28:25 -05:00
Kelvin Wang ea54e2c2c8 fix(kv): fix empty org name 2019-04-26 18:16:28 -04:00
Stuart Carnie fb39ac39ce
fix(storage): Store.Read behavior changed to return unsorted series keys
Closes 
2019-04-26 10:38:59 -07:00
Jonathan A. Sternberg 46d2d0012b fix(storage): translate _measurement and _field to the proper strings ()
The RPC call should translate `_measurement` and `_field` to their
proper shortened byte strings when requesting the tag values.

This also fixes the planner rewrites to return the root node even when
no rewrite happened as this is required by the planner.
2019-04-26 10:36:51 -07:00
Stuart Carnie ed344d25f8
feat(storage): Teach storage how to find a distinct set of tag keys
The TagValues API will perform a linear scan if there is no predicate;
otherwise, it will use the index to find a list of candidate series
keys.

TagKeys expects the predicate to be transformed such that
`_measurement` and `_field` are remapped to `\x00` and `\xff`
respectively.

There is one TODO marked to analyze the predicate for a
`\x00 = '<measurement>'` pattern. If found, the predicate can be
eliminated and fall back to a linear prefix scan by combining the org,
bucket and measurement. This is tracked by issue .
2019-04-24 11:14:22 -07:00
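The remapping that the commit above expects callers to apply can be sketched as a small helper. The byte strings `\x00` and `\xff` come from the commit message; the function name `normalizeTagKey` and the constant names are hypothetical.

```go
package main

import "fmt"

// The two byte strings are the ones named in the commit message.
const (
	measurementTagKey = "\x00"
	fieldTagKey       = "\xff"
)

// normalizeTagKey is a hypothetical helper that maps the user-facing
// special keys to their storage form and leaves ordinary tag keys alone.
func normalizeTagKey(key string) string {
	switch key {
	case "_measurement":
		return measurementTagKey
	case "_field":
		return fieldTagKey
	default:
		return key
	}
}

func main() {
	for _, k := range []string{"_measurement", "_field", "host"} {
		fmt.Printf("%q -> %q\n", k, normalizeTagKey(k))
	}
}
```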
Ben Johnson 01bfcf822b
Merge point parse & explode ()
Merge point parse & explode
2019-04-24 10:30:16 -06:00
Ben Johnson 272f340c30
Merge point parse & explode. 2019-04-24 10:12:15 -06:00
Jonathan A. Sternberg 5e77bd1e28
feat(query): implement the read tag values rpc call in the query engine ()
If a pattern is seen that matches the `v1.tagValues(...)` call, then it
will be replaced with a direct RPC call to read the tag values for the
selected tag key which should be better optimized than reading from the
storage engine tsm1 files.
2019-04-23 12:56:35 -05:00
kelwang 0916741838
Merge pull request from influxdata/bucket_scrapper_org_rename
Bucket scrapper org rename
2019-04-22 19:46:51 -04:00
Jacob Marble bc2650813d
chore(storage): Merge StringIterator cursor stats ()
* chore(storage): Merge StringIterator cursor stats

* add unit test

* properly count stats after iteration
2019-04-22 15:39:21 -07:00
Kelvin Wang 7a72c363f2 remove org from bucket 2019-04-22 18:39:05 -04:00
Jonathan A. Sternberg e5657ca62b
feat(query): implement the read tag keys rpc call in the query engine ()
If a pattern is seen that matches reading the tag keys, it will be
replaced with a direct RPC call to read the tag keys which should be
better optimized than reading from the storage engine tsm1 files.
2019-04-22 14:09:44 -05:00
Jacob Marble 662cf578c9
fix(storage): Assume sorted StringIterators, and retain sorted order ()
* fix(storage): Assume sorted StringIterators, and retain sorted order

* embed struct field

* improve MergedStringIterator efficiency
2019-04-19 01:22:49 -07:00
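Assuming sorted inputs makes a linear merge possible. The sketch below works on plain string slices rather than the real `StringIterator` type, and it collapses values that appear in both inputs; whether the actual merged iterator deduplicates is an assumption here.

```go
package main

import "fmt"

// mergeSorted merges two ascending slices into one ascending slice,
// emitting a value that appears in both inputs only once.
func mergeSorted(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default: // equal: keep one copy and advance both sides
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

func main() {
	fmt.Println(mergeSorted([]string{"cpu", "host"}, []string{"host", "region"}))
}
```

Because each input is consumed front to back exactly once, the merge needs no sorting step and the output stays in ascending order.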
Stuart Carnie 7a3b097197
fix: Add API compatibility assertion 2019-04-18 16:19:19 -07:00
Stuart Carnie 972cda1775
feedback: Changes in response to PR feedback 2019-04-18 16:19:18 -07:00
Stuart Carnie 7fc9661b7b
chore: Move StringIterator to cursors package for wider reuse 2019-04-18 16:19:17 -07:00
Todd Persen cd64ec8718 Fix typos in miscellaneous packages 2019-04-17 13:30:22 -07:00
Jacob Marble ff82e844ae
feat(storage): Add StringIteratorWriter.WrittenN() () 2019-04-17 11:05:45 -07:00
Jacob Marble 53810fadeb
feat(storage): Implement storage schema RPC de/serializer, merge, APIs ()
* Extend storage service protobuf with TagKeys and TagValues

Co-authored-by: Michael Desa <mjdesa@gmail.com>
Co-authored-by: Jacob Marble <jacobmarble@influxdata.com>

* Extend the reads.Store interface with new TagKeys and TagValues APIs

* Extend readservice.store to implement refactored reads.Store interface

* Implement a StringIterator gRPC writer / serializer

* Implement a StringIterator gRPC reader / deserializer

* Implement a StringIterator merger
2019-04-16 16:01:05 -07:00
Stuart Carnie 5a224c74ea
feat(storage): Stub storage schema APIs
Closes 
2019-04-12 12:09:46 -07:00
Jacob Marble f56c42794b
chore(tracing): Cleanup ()
* chore(tracing): Cleanup

* broken test

* fix unused var

* fix test
2019-04-10 19:28:21 -07:00
kelwang be674622c6
Revert "fix(inmem): remove the old inmem implementation" 2019-04-09 14:24:40 -04:00
kelwang d0022dfd5c
Merge pull request from zhulongcheng/rm-inmem-impl
fix(inmem): remove the old inmem implementation
2019-04-09 13:06:50 -04:00
jlapacik 0cde401678 refactor(storage): update GetSource method of Store interface 2019-04-08 15:59:37 -07:00
jlapacik 8078b915fd refactor(storage): ReadFilter storage operation 2019-04-08 15:59:37 -07:00
zhulongcheng cacd6a8223 fix(inmem): replace inmem.Service with kv.Service 2019-04-08 15:18:38 +08:00
Ben Johnson 307bb6af9c
Improve bulk series file writes. 2019-04-05 14:38:58 -06:00
Jeff Wendling 5d594deb2d
Merge pull request from influxdata/jmw-protobuf-bump
chore: bump gogo/protobuf and regenerate
2019-04-04 14:35:38 -06:00
jlapacik 2909927ab0
fix(reads): HasFieldValueKey searches for "$", not "_value" ()
Storage converts references to _value in filter predicates to $.
It then considers any predicate that does not reference $ to be
a tag predicate. Tag predicates are used to construct series index
cursors.

This commit fixes a bug where field comparisons were being included
in tag predicates because the HasFieldValueKey function searched
for comparison expressions referencing _value instead of $. Because
all references to _value had already been replaced, an expression
of the form '$ < 3000' would be considered a tag expression and
would therefore be mistakenly included as a tag predicate.

Fixes .
2019-04-04 11:13:09 -07:00
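The ordering problem in the commit above can be reproduced with a toy expression tree. The `node` type and the lower-case function names are hypothetical; only the `_value` to `$` rewrite and the check for `$` come from the commit message.

```go
package main

import "fmt"

// node is a tiny, hypothetical expression tree: leaves carry a reference
// (or nothing, for a literal), inner nodes carry an operator.
type node struct {
	op          string // e.g. "<" or "="; empty for leaves
	ref         string // tag or field reference; empty for literals
	left, right *node
}

// fieldValueKey is storage's internal name for _value, per the commit message.
const fieldValueKey = "$"

// rewriteValueRefs replaces every reference to _value with $.
func rewriteValueRefs(n *node) {
	if n == nil {
		return
	}
	if n.ref == "_value" {
		n.ref = fieldValueKey
	}
	rewriteValueRefs(n.left)
	rewriteValueRefs(n.right)
}

// hasFieldValueKey reports whether the expression references the field value.
// It must look for "$" because it runs after rewriteValueRefs.
func hasFieldValueKey(n *node) bool {
	if n == nil {
		return false
	}
	return n.ref == fieldValueKey || hasFieldValueKey(n.left) || hasFieldValueKey(n.right)
}

func main() {
	fieldExpr := &node{op: "<", left: &node{ref: "_value"}, right: &node{}}
	tagExpr := &node{op: "=", left: &node{ref: "host"}, right: &node{}}
	rewriteValueRefs(fieldExpr)
	rewriteValueRefs(tagExpr)
	fmt.Println(hasFieldValueKey(fieldExpr), hasFieldValueKey(tagExpr))
}
```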
Jeff Wendling 5dc3e52fd9 chore: bump gogo/protobuf and regenerate
It had been bumped previously from v1.1.1 to v1.2.0 and nothing was
regenerated. This bumps it to v1.2.1 and regenerates.
2019-04-03 15:42:33 -06:00
Jeff Wendling 157c4fcf0c storage: add tests for retention metrics 2019-04-01 17:44:37 -06:00
Jeff Wendling 3e370b0f99 storage: make retention metrics global
In the case that there are multiple engines, we want the retention
metrics from all of them.
2019-04-01 17:44:37 -06:00
Jacob Marble 0b626beb53
chore(storage): Don't log every read request ()
* don't log every read request

* check for err
2019-03-25 13:14:45 -07:00
Stuart Carnie 8abb76cb4e
Merge pull request from influxdata/sgc/data-gen
Add data generation subcommand to influxd
2019-03-20 11:08:49 -07:00
Lorenzo Affetti 7dbecfa256
fix(storage/reads): release arrow buffers when advancing on table () 2019-03-20 15:48:59 +01:00
Stuart Carnie fe8b63c10b
chore(gen): Update the gen package to produce 2.0 series keys 2019-03-19 20:35:02 -07:00
Stuart Carnie 07aef958f8
fix(reads): Interface to identify a remote streamed cursor
A cursor can be in process (local) or remote. Remote cursors have
already applied the `hasPoints` check, to reduce network traffic.
Testing whether the cursor has points again here:

https://github.com/influxdata/influxdb/blob/master/storage/reads/reader.go#L221

will always return `false` for a remote cursor that has applied the
NoPoints optimization.

This temporary fix allows the `hasPoints` function to differentiate
a streamCursor and always return true in that case.
2019-03-11 18:40:35 -07:00
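One way to let `hasPoints` tell a remote cursor from a local one is a marker interface checked with a type assertion. The sketch below is an assumption about the shape of such a fix: the `cursor` and `streamCursor` interfaces, the `isStream` method, and both cursor types are hypothetical.

```go
package main

import "fmt"

// cursor is a minimal, hypothetical cursor: Next returns the number of
// points in the next batch, or 0 when the cursor is exhausted.
type cursor interface{ Next() int }

// streamCursor marks cursors fed by a remote stream that has already
// dropped series without points. The method name is an assumption.
type streamCursor interface {
	cursor
	isStream()
}

type localCursor struct{ batches []int }

func (c *localCursor) Next() int {
	if len(c.batches) == 0 {
		return 0
	}
	n := c.batches[0]
	c.batches = c.batches[1:]
	return n
}

// remoteCursor reuses localCursor's Next and adds the marker method.
type remoteCursor struct{ localCursor }

func (*remoteCursor) isStream() {}

// hasPoints trusts the remote side for stream cursors and otherwise
// inspects the first batch.
func hasPoints(c cursor) bool {
	if _, ok := c.(streamCursor); ok {
		return true // the check was already applied remotely
	}
	return c.Next() > 0
}

func main() {
	fmt.Println(hasPoints(&localCursor{}))                  // empty local cursor
	fmt.Println(hasPoints(&localCursor{batches: []int{2}})) // local cursor with data
	fmt.Println(hasPoints(&remoteCursor{}))                 // remote cursor is trusted
}
```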