458 Commits (85d75e3d4e865361a36ed704d03e4bea1d977a28)

Author SHA1 Message Date
Jonathan A. Sternberg 025319c387
fix(services/storage): multi measurement queries return all applicable series (#19566)
This fixes multi measurement queries that go through the storage service
to correctly pick up all series that apply with the filter. Previously,
negative queries such as `!=`, `!~`, and predicates attempting to match
empty tags did not work correctly with the storage service when multiple
measurements or `OR` conditions were included.

This was because these predicates would be categorized as "multiple
measurements" and then it would attempt to use the field keys iterator
to find the fields for each measurement. The meta queries for these did
not correctly account for negative equality operators or empty tags when
finding appropriate measurements and those could not be changed because
it would cause a breaking change to influxql too.

This modifies the storage service to use new methods that correctly
account for the above situations rather than the field keys iterator.

Some queries that appeared to be single measurement queries also get
considered as multiple measurement queries. Any query with an `OR`
condition will be considered a multiple measurement query.

This bug did not apply to single measurement queries where one
measurement was selected and all of the logical operators were `AND`
values. This is because it used a different code path that correctly
handled these situations.
2020-09-17 14:28:24 -05:00
Stuart Carnie 8753a7fd08 chore: Fix invalid string casts from integers
Newer Go versions generate a compile time error
2020-09-16 11:55:20 -07:00
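
Illustration (not part of the commit): a minimal sketch of the kind of conversion this message refers to. Converting an integer with `string(n)` produces the character for that code point, not its decimal text, and recent Go toolchains flag the bare conversion. The helper names `formatID` and `codePoint` are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatID is a hypothetical helper that renders an integer as decimal text.
func formatID(n int) string {
	return strconv.Itoa(n)
}

// codePoint is a hypothetical helper that makes the "integer as a
// character" intent explicit by converting through rune.
func codePoint(n int) string {
	return string(rune(n))
}

func main() {
	fmt.Println(formatID(65))  // decimal text
	fmt.Println(codePoint(65)) // single character
}
```

Which of the two forms is correct depends on the intent at each call site; the sketch only shows that they differ.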
Ayan George ca2055c16c
refactor: Replace ctx.Done() with ctx.Err() (#19546)
* refactor: Replace ctx.Done() with ctx.Err()

Prior to this commit we checked for context cancellation with a select
block and context.Context.Done() without multiplexing over any other
channel like:

  select {
    case <-ctx.Done():
      // handle cancellation
    default:
      // fallthrough
  }

This commit replaces those types of blocks with a simple check of
ctx.Err(). This has the following benefits:

* Calling ctx.Err() is much faster than entering a select block.

* ctx.Done() allocates a channel when called for the first time.

* Testing the result of ctx.Err() is a reliable way of determining if
  a context.Context value has been canceled.

* fix: Fix data race in execDeleteTagValueEntry()
2020-09-16 12:20:09 -04:00
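
Illustration (not part of the commit): a minimal sketch that places the two cancellation checks described in the message above side by side. The function names `canceledSelect`, `canceledErr`, and `checkBoth` are hypothetical; only `context.Context.Done()` and `context.Context.Err()` come from the message.

```go
package main

import (
	"context"
	"fmt"
)

// canceledSelect is the older form: a non-blocking select on ctx.Done().
func canceledSelect(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// canceledErr is the replacement form: Err is non-nil once the context
// has been canceled or its deadline has passed.
func canceledErr(ctx context.Context) bool {
	return ctx.Err() != nil
}

// checkBoth builds a context, optionally cancels it, and reports what
// each check observes.
func checkBoth(cancelFirst bool) (bySelect, byErr bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	return canceledSelect(ctx), canceledErr(ctx)
}

func main() {
	fmt.Println(checkBoth(false))
	fmt.Println(checkBoth(true))
}
```

Both checks agree before and after cancellation; the second simply avoids the select block.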
Stuart Carnie 04cff2a8d2 fix: Unit test 2020-09-04 15:56:57 -07:00
Stuart Carnie a24edb2b1c
chore: Skip tests on circleci
This is derived from 2fd8264 and 4f850b5, which skips tests on appveyor
2020-08-31 12:14:27 -07:00
Stuart Carnie f2205b37aa
chore: Skip TSI cardinality tests on circleci
This is derived from 793635d, which skips tests on appveyor
2020-08-31 12:11:04 -07:00
Stuart Carnie b1b6c1047a
chore: remove t.Parallel() in an attempt to make CircleCI happy 2020-08-31 10:45:23 -07:00
Brett Buddin b917d8d9b0
chore(influxdb): Placate the linter. 2020-08-27 15:46:32 -04:00
Stuart Carnie dee8977d2c
chore: move v2/v1/tsdb → v2/tsdb 2020-08-26 10:46:47 -07:00
Edd Robinson 2b175291be
refactor: WIP removing tsdb 2020-08-03 09:18:34 -07:00
Stuart Carnie e3060c291c
refactor: tsdb store builds and runs 2020-08-03 09:18:32 -07:00
Stuart Carnie 92efddbfbe
chore(tsdb): Initial commit of tsdb package
* pulls in 1.x tsdb, compiles and passes test
2020-08-03 09:17:23 -07:00
Ben Johnson 14a82ee65d fix(tsdb): Fix mincore wait() out of bounds calls 2020-07-27 11:48:39 -06:00
Ben Johnson 3cc2638bbf feat(tsi1): Add optional mincore limiter to TSI 2020-07-22 10:17:42 -06:00
Gavin Cabbage 3c6b728702
chore: use go generate to download large tsdb testdata (#18993)
* chore: use go generate to download large tsdb testdata

* chore(gitignore): TSM/TSI verbiage
2020-07-22 11:29:22 -04:00
Ben Johnson c476da2153
Merge pull request #18982 from influxdata/mincore-limiter
feat(mincore): Add page fault limiter
2020-07-17 12:22:54 -06:00
Ben Johnson c28eb70856 feat(mincore): Add page fault limiter
This commit adds `mincore.Limiter` which throttles page faults caused
by mmap() data. It works by periodically calling `mincore()` to determine
which pages are not resident in memory and using `rate.Limiter` to
throttle accessing using a token bucket algorithm.
2020-07-17 09:37:31 -06:00
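
Illustration (not part of the commit): a simplified sketch of token-bucket throttling applied only to accesses that would touch non-resident pages. It is not `mincore.Limiter` and does not call `mincore()` or use `rate.Limiter`; the `bucket` type, the `touchPage` function, the `resident` flag, and the seconds-based clock are all assumptions made to keep the example self-contained and deterministic. The sketch also reports whether an access may proceed instead of waiting for a token.

```go
package main

import "fmt"

// bucket is a hypothetical token bucket. Time is passed in as seconds so
// the behavior is deterministic.
type bucket struct {
	rate   float64 // tokens added per second
	burst  float64 // maximum number of stored tokens
	tokens float64 // tokens currently available
	last   float64 // time of the previous refill, in seconds
}

func newBucket(rate, burst, now float64) *bucket {
	return &bucket{rate: rate, burst: burst, tokens: burst, last: now}
}

// allow refills the bucket for the elapsed time, then spends one token
// if one is available.
func (b *bucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// touchPage models one page access: resident pages are free, while
// non-resident pages (which would page fault) must spend a token.
func touchPage(b *bucket, resident bool, now float64) bool {
	if resident {
		return true
	}
	return b.allow(now)
}

func main() {
	b := newBucket(1, 2, 0)
	for i := 0; i < 3; i++ {
		fmt.Println("t=0 non-resident:", touchPage(b, false, 0))
	}
	fmt.Println("t=0 resident:", touchPage(b, true, 0))
	fmt.Println("t=1 non-resident:", touchPage(b, false, 1))
}
```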
Gavin Cabbage ef3ee96eea
chore: download tsi1 testdata with go generate (#18972)
* chore: remove tsi1 testdata and add go generate file to download

* chore: fix testdata url and rename gen file

* fix: add testdata generate command to Makefile

* chore: add testdata dir to gitignore

* refactor(tsdb): improve error message when missing testdata

* refactor(tsdb): tagged testdata and avoid stacktrace when missing
2020-07-17 11:31:29 -04:00
ricky dcf995922c test: set bigger max size of cache in TestConcurrentReadAfterWrite 2020-07-16 10:05:30 +08:00
ricky 9e82797a38 fix: missing data when reading after writing 2020-07-15 14:49:42 +08:00
Phil Bracikowski 25461dddcd
chore(testing): add missing defer to clean up test temp files (#18948) 2020-07-14 13:52:28 -07:00
Stuart Carnie 99bbbd3e4e
fix(storage): Reduce the check frequency
Checking a channel too regularly could cause
context switching to other goroutines. In tight loops,
it is prudent to check, but to do so less frequently so
as to avoid thrashing.
2020-07-09 18:44:00 -07:00
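
Illustration (not part of the commit): a minimal sketch of checking for cancellation only every N iterations of a tight loop. The interval of 1024, the names `sumUntilCanceled` and `demo`, and the use of `ctx.Err()` rather than a channel receive are assumptions; the commit message does not state which interval or check it uses.

```go
package main

import (
	"context"
	"fmt"
)

// checkInterval is an assumed value chosen for illustration.
const checkInterval = 1024

// sumUntilCanceled adds up values but polls the context only once every
// checkInterval iterations, so the hot loop is rarely interrupted.
func sumUntilCanceled(ctx context.Context, values []int64) (int64, error) {
	var sum int64
	for i, v := range values {
		if i%checkInterval == 0 {
			if err := ctx.Err(); err != nil {
				return sum, err
			}
		}
		sum += v
	}
	return sum, nil
}

// demo sums n ones under a context that is optionally canceled up front.
func demo(n int, cancelFirst bool) (int64, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	values := make([]int64, n)
	for i := range values {
		values[i] = 1
	}
	return sumUntilCanceled(ctx, values)
}

func main() {
	fmt.Println(demo(5000, false))
	fmt.Println(demo(5000, true))
}
```

The trade-off is latency: a cancellation may go unnoticed for up to `checkInterval` iterations.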
Brett Buddin 51406f4f62
feat(tsdb): SHOW TAG KEYS (no time) query using only TSI data. (#18905)
* feat(tsdb): SHOW TAG KEYS (no time) query using only TSI data.

* fix(tsdb): Allow for earlier return when scanning during show tag keys.

* fix(tsdb): Speed things up by using the key merger to reduce allocs.

* chore(tsm1): Fix golint.

* fix(tsdb): Remove sorting, because these keys should already be sorted.

* fix(tsdb): Remove dead code to placate the linter.
2020-07-09 18:01:42 -04:00
Ben Johnson be98fe3a81
Merge pull request #18901 from influxdata/tsm1-file-stat-created-at
feat(tsdb): Add CreatedAt field for tsm1.FileStat
2020-07-09 14:13:00 -06:00
jlapacik 49bdad8681 fix: descending array cursor should include end time
Fixes https://github.com/influxdata/influxdb/issues/18897.
2020-07-09 12:22:25 -07:00
jlapacik e6e55038e8 test: descending array cursor should include end time 2020-07-09 12:22:25 -07:00
Stuart Carnie d2dd19b70e
feat(storage): InfluxQL schema APIs without time range
These changes introduce optimized schema APIs for InfluxQL that
utilize the time series index (TSI) exclusively for significant
performance gains.
2020-07-09 10:09:19 -07:00
Ben Johnson 3fe7c63a0a feat(tsdb): Add CreatedAt field for tsm1.FileStat
This commit adds a "created at" field to `tsm1.FileStat` which
uses the `ModTime()` of the TSM file but excludes any updates
for tombstone files.
2020-07-09 10:38:59 -06:00
Gavin Cabbage 34ebc852c0
fix(tsm1): delimit tsmKeyPrefix with appended comma (#18785)
* fix(tsm1): delimit tsmKeyPrefix with appended comma

Fixes #7589.

Append a comma to the TSM key prefix when matching a full measurement name to avoid erroneously matching other measurement names that include the prefix in their own name. For example, this prevents matching a measurement "cpu1" when targeting "cpu" by updating the prefix to "cpu,". This relies on the fact that tag key-value pairs are separated by commas.

* fix(tsm1): regression tests for tsmKeyPrefix comma delimiting
2020-07-01 12:24:54 -04:00
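
Illustration (not part of the commit): a minimal sketch of why the trailing comma matters when prefix-matching series keys. The keys here are simplified strings of the form `measurement,tag=value`; the function names are hypothetical, and the real TSM keys and `tsmKeyPrefix` handling are not reproduced.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesBare reports whether key starts with the measurement name alone.
func matchesBare(key, measurement string) bool {
	return strings.HasPrefix(key, measurement)
}

// matchesDelimited appends the comma that separates the measurement from
// its tag pairs before matching, so "cpu" no longer matches "cpu1".
func matchesDelimited(key, measurement string) bool {
	return strings.HasPrefix(key, measurement+",")
}

func main() {
	for _, key := range []string{"cpu,host=a", "cpu1,host=a"} {
		fmt.Printf("%-12s bare=%t delimited=%t\n",
			key, matchesBare(key, "cpu"), matchesDelimited(key, "cpu"))
	}
}
```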
Brett Buddin 0c268e205b
fix(storage): Push-down a predicate to match tags for SHOW MEASUREMENT calls (#18740)
* fix(storage): Push-down a predicate to match tags for SHOW MEASUREMENTS calls.

* chore: Address feedback.

* fix(tsm1): Split behavior based on existence of predicate for show measurements.

* fix(tsm1): Allow parenthesis expression on the LHS of a predicate.

* fix(tsm1): Create a separate tag predicate verifier that rejects negative comparisons.

* fix(tsm1): Additional test cases for show measurements with predicate.
2020-06-29 14:31:54 -04:00
Jonathan A. Sternberg 5aeca082c8
chore: update staticcheck and fix newly identified lint checks (#18737) 2020-06-26 18:54:09 -05:00
Ben Johnson 171f6586a0 fix(tsdb): Add refs for file-sourced tag keys
This commit adds ref counting for files that we pull tag keys from.
Previously, files were only ref counted during the time we extracted
tag keys but this commit adds additional ref counting for the life of
the `Engine.tagKeysNoPredicate()` function.
2020-06-17 10:27:23 -06:00
Ben Johnson 69fe9ed1ba
Merge pull request #17769 from patriczek/iss17257
fix: Migrated bucket should have correct retention policy.
2020-04-20 13:40:15 -06:00
Patrik Helia 07c89c9188 Fix fmt and reduce code
Signed-off-by: Patrik Helia <patrik.helia@kiwi.com>
2020-04-20 21:25:38 +02:00
Stuart Carnie c76f30682c
fix(storage): Feedback in response to PR review
* Adds clarifying documentation
* Regenerate protocol buffers with updated documentation
2020-04-16 15:19:28 -07:00
Stuart Carnie 6325591deb
feat(storage): New data types for measurement schema gRPC APIs
This commit

* adds new request and response data types for schema gRPC calls
* adds fmt.Stringer implementation to cursors.FieldType
* adds APIs to sort a slice of MeasurementField values
* upgrades the gogo protobuf package to v1.3.1, which
  includes improvements to serialization.
2020-04-16 14:51:31 -07:00
Stuart Carnie 69820c08a4
feat(tsdb): Add maximum timestamp to MeasurementField
This is required in order to correctly merge results from multiple
sources.
2020-04-16 14:51:30 -07:00
Patrik Helia 7ce7e62f60 fix: Migrated bucket should have correct retention policy.
Signed-off-by: Patrik Helia <patashelia@gmail.com>
2020-04-16 21:35:48 +02:00
Stuart Carnie 21e339a32f
chore(storage): Fix documentation to reflect correct time interval 2020-04-14 11:04:56 -07:00
Stuart Carnie fe0ed6cb7e
feat(storage): Provide public MeasurementFields API 2020-04-14 10:49:16 -07:00
Stuart Carnie cb618efc65
feat(tsm1): Implementation of MeasurementFields
This commit provides an implementation of the MeasurementFields
API per the design previously outlined.
2020-04-08 16:15:34 -07:00
Stuart Carnie 7de6383adf
refactor(tsm1): Allow race-free access to cache
This commit adds a new API to `Cache` to address data races
with the `TagKeys` and `TagValues` APIs.

`Cache` and `entry` provide `AppendTimestamps`, which
appends the current timestamps to the provided slice
to reduce allocations. As noted in the documentation,
it is the responsibility of the caller to sort and deduplicate
the values, if required.

The `cursors.TimestampArray` type was extended to permit
use of the `sort.Sort` API.
2020-04-08 16:15:05 -07:00
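
Illustration (not part of the commit): a minimal sketch of the caller-side sort and deduplication that the message above leaves to the caller after appending timestamps. The `timestamps` type is a hypothetical stand-in for a slice that implements `sort.Interface`; it is not `cursors.TimestampArray`.

```go
package main

import (
	"fmt"
	"sort"
)

// timestamps is a hypothetical slice type implementing sort.Interface.
type timestamps []int64

func (t timestamps) Len() int           { return len(t) }
func (t timestamps) Less(i, j int) bool { return t[i] < t[j] }
func (t timestamps) Swap(i, j int)      { t[i], t[j] = t[j], t[i] }

// sortDedup sorts ts in ascending order and removes duplicates in place,
// reusing the backing array to avoid an extra allocation.
func sortDedup(ts timestamps) timestamps {
	if len(ts) < 2 {
		return ts
	}
	sort.Sort(ts)
	out := ts[:1]
	for _, t := range ts[1:] {
		if t != out[len(out)-1] {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	fmt.Println(sortDedup(timestamps{30, 10, 20, 10, 30}))
}
```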
Stuart Carnie 31df76e1e9
refactor(tsm1): Add TimeRangeMaxTimeIterator
This commit introduces a new API for finding the maximum
timestamp of a series when iterating over the keys in a
set of TSM files.

This API will be used to determine the field type of a single
field key by selecting the series with the maximum timestamp.

It has also refactored the common functionality for iterating
TSM keys into `timeRangeBlockReader`, which is shared
between `TimeRangeIterator` and `TimeRangeMaxTimeIterator`.
2020-04-08 16:05:19 -07:00
Jonathan A. Sternberg 6e4cf7ffef
refactor: fix imports from go template files (#17615) 2020-04-03 17:40:36 -05:00
Jonathan A. Sternberg 0ae8bebd75
refactor: rewrite imports to include the /v2 suffix for version 2 2020-04-03 12:39:20 -05:00
Stuart Carnie 069820ba4b
fix(models): Added error return value; use iota; fix spelling 2020-04-02 08:34:22 -07:00
Stuart Carnie d424d7d1f5
feat(tsdb): Add new measurement based schema APIs
These APIs require a measurement, permitting an additional optimization
to reduce the search space against the TSM index. Specifically, the
search key prefix is extended from `org+bucket` to
`org+bucket,\x00=<measurement>`

* MeasurementNames
* MeasurementTagKeys
* MeasurementTagValues
* Adds an API to the models package for efficiently parsing the
  measurement tag (\x00) from a normalized series key
2020-04-02 08:33:58 -07:00
Stuart Carnie 37a97437e7
fix: Invariant violated: mixed block types for a single series
The root cause is that the Unsigned data type has no representation
in the valueType function in the cache and falls back to the default
case of 0.

0 is also a sentinel value in the entry#add function that will
result in skipping the value type check.

It is therefore possible that unsigned values followed by some other
data type are stored in the cache.

It is suspected that the write may be rejected before reaching the
cache, and therefore may not occur in practice. Specifically, the
series file stores the data types on a per-series basis and would
reject the write.

This commit turns the value types into explicit constants and
ensures all existing block types are represented. In addition,
it adds a mapping function to convert these to a known Block type,
which will be used by the `MeasurementFields` schema request to
determine the type of a series in the cache.
2020-04-01 18:42:22 -07:00
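
Illustration (not part of the commit): a loose sketch of the idea described above, in which every supported value type has an explicit non-zero constant and zero is reserved for "undefined", so an unsigned value can no longer slip past the type check. All names and constant values here (`valueTypeOf`, `entry`, `add`, and the `valueType*` constants) are hypothetical and do not reproduce the tsm1 cache code.

```go
package main

import "fmt"

// Explicit value types. Zero means "undefined" and is never returned for
// a supported type.
const (
	valueTypeUndefined = 0
	valueTypeFloat64   = 1
	valueTypeInteger   = 2
	valueTypeString    = 3
	valueTypeBoolean   = 4
	valueTypeUnsigned  = 5
)

// valueTypeOf maps a Go value to its explicit value type.
func valueTypeOf(v interface{}) int {
	switch v.(type) {
	case float64:
		return valueTypeFloat64
	case int64:
		return valueTypeInteger
	case string:
		return valueTypeString
	case bool:
		return valueTypeBoolean
	case uint64:
		return valueTypeUnsigned
	default:
		return valueTypeUndefined
	}
}

// entry remembers the type of the first value it accepted.
type entry struct {
	vtype int
}

// add rejects unsupported values and values whose type differs from the
// type already stored in the entry.
func (e *entry) add(v interface{}) error {
	vt := valueTypeOf(v)
	if vt == valueTypeUndefined {
		return fmt.Errorf("unsupported value type %T", v)
	}
	if e.vtype == valueTypeUndefined {
		e.vtype = vt
		return nil
	}
	if e.vtype != vt {
		return fmt.Errorf("mixed value types: have %d, got %d", e.vtype, vt)
	}
	return nil
}

func main() {
	e := &entry{}
	fmt.Println(e.add(uint64(1)))
	fmt.Println(e.add(float64(1)))
}
```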
Ben Johnson 7d72b4e511 feat(tsdb): Bulk delete series performance improvement 2020-03-18 15:47:35 -06:00
Edd Robinson d96cbd4f74
Merge pull request #17016 from influxdata/er-bulk-import
feat(storage): prototype 1.x–2.x migration tooling
2020-03-18 17:57:26 +00:00