Commit Graph

2657 Commits (3f3b7b5160a882ba513943fa0a72908a21ea6c50)

Author SHA1 Message Date
Ben Johnson 4a1a8c0041
Merge pull request #18689 from influxdata/batch-write-tombstones-when-deleting
perf(tsi1): batch write tombstone entries when dropping/deleting
2020-06-25 08:15:12 -06:00
Ben Johnson be0edf5d75
Merge pull request #18695 from influxdata/epoch-wait-when-dropping-shard
fix(tsi1): wait deleting epoch before dropping shard
2020-06-25 08:14:53 -06:00
Ayan George a9d02e7ab7
fix: Handle snapshot related errors (#18710)
When applied this patch will:

* log snapshot directory removal errors

  Prior to this patch, errors when removing temporary snapshot
  directories happens silently.

  This patch ensures that errors are logged when os.RemoveAll() fails.

* refactor tsm1: Declare error value in condition

  Save a line of code and limits the scope of an error value.

* refactor tsm1: Add MakeSnapshotLinks()

  This commit adds (*FileStore).MakeSnapshotLinks().  The code in this
  function was originally part of CreateSnapshot().

  That code was hoisted out and into MakeSnapshotLinks() becuase there
  are two points of failure that require cleanup -- we have to delete a
  temporary directory on failure.

  Placing the code in one function allows us to check its returned error
  value and perform cleanup in only once place.

  In short, we hoisted code out of CreateSnapshot() to simplify error
  handling.

  On error, we remove any directories we created.
2020-06-25 10:05:04 -04:00
dengzhi.ldz 42dba6487a fix(tsi1): wait deleting epoch before dropping shard 2020-06-24 09:37:13 -06:00
dengzhi.ldz 331569bc11 perf(tsi1): batch write tombstone entries when dropping/deleting 2020-06-24 09:26:09 -06:00
Tristan Su 7be913de6e
chore(tsdb): clean up unused ShardID in EngineOptions (#17243) 2020-06-22 15:01:32 -07:00
Ben Johnson 51f647d763 fix(tsdb): Defer closing of underlying SeriesIDSetIterators
This commit changes the SeriesIDSet merge/union/intersect functions
to attach the underlying iterators as closers so that files can be
retained until the data is no longer in use. The roaring operations
can leave containers pointing at mmap data in the resulting bitmap
so we have to track underlying file usage until the data is finished
with.
2020-05-22 10:46:05 -06:00
Ben Johnson 9c41e12ee4 fix(tsdb): Disable series id set cache size by default.
This commit changes `DefaultSeriesIDSetCacheSize` to zero so that the
tag value cache is disabled by default. There is a rare known bug where
the cache can cause a segfault which crasheds the process. The cache
is being disabled instead of removed as some users may still need the
cache for performance reasons.
2020-05-19 10:04:05 -06:00
J. Emrys Landivar 6be2b37e0e
Merge pull request #18014 from foobar/cleanup-unused-functions
chore: clean up unused functions
2020-05-18 18:47:07 -05:00
Ayan George 3b9be0145c
fix: address static check warning s1039 (#18135)
This commit quiets staticcheck's warnings about "unnecessary use of
fmt.Sprintf" and "unnecessary use of fmt.Sprint".

Prior to this commit we were wrapping simple constant strings without
any formatting verbs with fmt.Sprintf().
2020-05-18 13:55:05 -04:00
Ayan George c0e391935d
fix(tsdb): address staticcheck warning SA4006 (#18127)
This commit addresses staticcheck warning SA4006 "this value of VAR is
never used" by removing unused variables
2020-05-18 13:11:44 -04:00
Ayan George 1a22af7172
fix(tsdb): address staticcheck warning st1006 (#18126)
* fix(tsdb): address staticcheck ST1006

This patch addresses staticcheck warning "receiver name should not be an
underscore, omit the name if it is unused (ST1006)" for 6 methods.
2020-05-17 22:36:06 -04:00
Ayan George 4cbc6a4269
fix(tsdb): Fix variables masked by a declaration (#18129)
Before this commit, the to and from variables were being re-declared in
a block in such a way that the values were not being used.

This patch uses regular assignment so that the values are visable
outside of the block where they're set.

Closes: 18128
2020-05-17 21:40:12 -04:00
Tristan Su d14acea44d chore: clean up unused functions 2020-05-08 13:45:34 +08:00
Ayan George a0f2e0c21a
fix(tsm1): Fix temp directory search bug (#17685)
* fix: verify precision parameter in write requests

This change updates the HTTP endpoints that service v1 and v2 writes to
verify the values passed in the precision parameter.

* fix(tsm1): Fix temp directory search bug

The original code's intention is to scan a directory for the directory
with the higest value when converted to an integer.

So directories may be in the form:

  0.tmp
  1.tmp
  2.tmp
  30.tmp
  ...
  100.tmp

The loop should scan the directory, strip the basename and extension
from the file name to leave just a number, then store the higest number
it finds.

Before this patch, there is a bug that has the code only store the
higest value if there is an error converting the numeric value into an
integer.

This patch primarily fixes that logic.

In addition, this patch will save an indent level by inverting logic in
two places:

Instead if checkig if a file is a directory and has a suffix of ".tmp",
it is probably better to test if a file is NOT a directory OR does NOT
have an extension of ".tmp" then continue.

Also, instead of testig if len(ss) == 2, we can test if len(ss) != 2 and
continue if so.

Both of these save an indent level and keeps our "happy path" to the
left.

Finally, this patch will use string concatination instead of calling
fmt.Sprintf() to add periods to "tmp" and "tsm" extension.

Co-authored-by: David Norton <dgnorton@gmail.com>
2020-04-15 10:29:46 -04:00
Ben Johnson 848ca48225 fix(tsdb): Revert "fix: remove some unsafe marshalling to reduce risk of segfault"
This reverts commit 30dab03310.
2020-04-13 13:22:22 -06:00
Ben Johnson 3f59d4be8e fix(tsdb): Fix error lists 2020-04-01 14:54:39 -06:00
elbehery 042128b948 fix(tsdb): Replace panic with error while de/encoding corrupt data
fixes #17440

While encoding or decoding corrupt data, the current behaviour is to `panic`.
This commit replaces the `panic` with `error` to be propagated up to the calling `iterator`.
To avoid overwriting other `error`, iterators now wraps a `TSMErrors` which contains ALL the encountered errors.
TSMErrors itself implements `Error()`, the returned string contains all the error msgs, separated by "," delimiter.
2020-04-01 20:51:11 +02:00
David Norton 25381f97c8
Merge pull request #15952 from influxdata/er-verify-tombstone
feat(inspect): add influx_inspect verify-tombstone tool
2020-03-11 15:56:37 -04:00
docmerlin (j. Emrys Landivar) 30dab03310 fix: remove some unsafe marshalling to reduce risk of segfault
We were seing segfaults in Roaring bitmaps sometimes, under very
high load with networked drives.  This may reduce risk of segfault by
forcing marshalling to copy the data.
2020-03-11 13:47:29 -04:00
Ayan George 5f47c388df
chore(influxdb): Forward port 16999 (#17032)
* fix: access tsi active log file with READ lock

The activeLogFile pointer may be altered by other routine so the READ
lock is needed.

* Merge pull request #16384 from foobar/tsi-partition-lock

fix: access tsi active log file with READ lock

Co-authored-by: Tristan Su <sooqing@gmail.com>
Co-authored-by: David Norton <dgnorton@gmail.com>
2020-03-04 16:20:58 -05:00
Edd Robinson c3f4382ed8
Merge pull request #16606 from influxdata/BP-1.8-er-tsm-block-fix
fix(storage): ensure all block data returned
2020-03-03 14:32:15 +00:00
Ben Johnson 7a9eb1420c fix(tsdb): Fix -compact-series-file flag 2020-02-06 13:40:19 -07:00
Gianluca Arbezzano a3e37f417d
Merge pull request #16627 from influxdata/feature/skip-wal-cache
chore(tsm1): skip WriteSnapshot during backup if snapshotter is busy
2020-02-04 21:44:48 +01:00
Gianluca Arbezzano 30621ca9ec chore(tsm1): skip WriteSnapshot during backup if snapshotter is busy
When an InfluxDB database is very busy writing new points the backup
the process can fail because it can not write a new snapshot.

The error is: `operation timed out with error: create snapshot: snapshot in progress`.

This happens because InfluxDB takes almost "continuously" a snapshot
from the cache caused by the high number of points ingested.

This PR skips snapshots if the `snapshotter` does not come available
after three attempts when a backup is requested.

The backup won't contain the data in the cache or WAL.

Signed-off-by: Gianluca Arbezzano <gianarb92@gmail.com>
2020-02-04 20:09:50 +01:00
Edd Robinson 34c0fdafc0 feat(storage): Offline series file compaction 2020-02-03 13:57:31 -07:00
David Norton 962cf6f4e7
Merge pull request #16595 from influxdata/fix/show-series-cardinality
fix(tsm1): improve series cardinality limit
2020-01-21 18:08:13 -05:00
David Norton 903d2c2d28 fix(tsm1): improve series cardinality limit
Prior to this change, new series would be added to the series file
before checking the series cardinality limit. If the limit was exceeded,
the write was rejected even though the series had already been added to
the series file.
2020-01-21 16:45:13 -05:00
Edd Robinson 22798fa290 fix(storage): ensure all block data returned
This commit prevents multiple blocks for the same series key having
values truncated when they are being read into an empty buffer.

The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
2020-01-21 18:00:11 +00:00
Sean Brickley fe55d728f0 fix(tsm1): Compaction log error 2020-01-13 20:05:05 -05:00
tmgordeeva f1d26652e9
fix(storage): skip TSM files with block read errors (#15885)
* fix(storage): skip TSM files with block read errors

When we find a bad TSM file during compaction, propagate the error up and move
the bad file aside. The engine will disregard the file so the next compaction
will not hit the same error.
2019-12-13 15:05:39 -08:00
Edd Robinson f7f19c5904 refactor(storage): add tombstone extension 2019-11-17 14:43:38 +00:00
Edd Robinson 196e46982b
Merge pull request #15861 from influxdata/BP-1.8-er-tsi-not-equal
fix(tsi1): index defect with negated equality filters
2019-11-13 18:15:01 +00:00
David Norton 102fcd671b fix(tsm1): make Digest() safe for concurrent use
This change adds a lock around digest creation so that it is safe for
concurrent calls. Prior to this change, calls from multiple goroutines
resulted in "Digest aborted, problem renaming tmp digest" errors.
2019-11-12 18:02:41 -05:00
Edd Robinson cac4c8956c fix(tsi1): index defect with negated equality filters
Fixes #15859

This commit fixes a defect in the TSI index where a filter using the
negated equality operator would result in no matching series being
returned for series stored within the `IndexFile` portions of the index.

The root cause of this was due to missing legacy-handling code in the
index for this particular iterator.
2019-11-12 15:10:42 +00:00
Jonathan A. Sternberg 40a7a577fc
Update flux version to v0.50.2
This upgrades the flux version to v0.50.2.

The secret service, which is used for alerts, is not included. The
`to()` function is also still not included.
2019-10-29 16:19:14 -05:00
elbehery a4bb1083f2 fix(storage): Renaming corrupt data files fails
fixes#14107
2019-10-28 17:32:58 +01:00
Ben Johnson 14d45c913c
fix(tsi1): replace TSI compaction wait group with counter 2019-09-18 11:47:37 -06:00
maxunt 021dd3114d
Merge pull request #14245 from influxdata/mu-tsm-clean-1.8-14058
Clean tmp tsm files when replace fails
2019-08-08 17:02:01 -07:00
Jacob Marble b731cc5e60
feat(storage): Limit concurrent series partition compaction (#14240)
* feat(storage): Limit concurrent series partition snapshots

* feat: make concurrency configurable

* fix: integrate review feedback

* refactor: rename config value
2019-07-30 10:34:06 -07:00
Edd Robinson 0ff7fb96b1
Merge pull request #14266 from influxdata/er-fix-fields
fix(storage): Fix issue where fields re-appear
2019-07-09 15:47:47 +01:00
Max U c6c0a5d3b1 change log level from info to error 2019-07-08 14:52:53 -04:00
Mark Rushakoff 2dfafe822c Remove stray fmt.Println in tsm1.StringArrayEncodeAll
It was introduced in #13699.

Updates #14265.
2019-07-05 15:58:53 -07:00
Edd Robinson f4413d726b test(storage): skip flaky test 2019-07-05 15:07:09 +01:00
Edd Robinson 9bfd1119b9 fix(storage): Fix issue where fields re-appear
Fixes #10052

This commit fixes an issue where field keys would reappear in results
when querying previously dropped measurements.

The issue manifests itself when duplicates of a new series are inserted
into the `inmem` index. In this case, a map that tracks the number of
series belonging to a measurement was incorrectly incremented once for
each duplication of the series. Then, when it came time to drop the
measurement, the index assumed there were several series belonging to
the measurement left in the index (because the counter was higher than
it should be). The result of that was that the `fields.idx` file (which
stores a mapping between measurements and field keys) was not truncated
and rebuilt. This left old field keys in that file, which were then
returned in subsequent queries over all field keys.
2019-07-05 12:24:03 +01:00
Max U 9091d72ba7 initial commit for 1.8 2019-07-02 13:18:20 -04:00
Edd Robinson 43e144a923 fix(storage): ensure WAL size correctly set on startup 2019-06-28 16:20:45 +01:00
Edd Robinson ecff62b9e4 test(storage): add test for reproducing #14229 2019-06-28 16:18:32 +01:00
Jonathan A. Sternberg 7ca4e644f1
Update flux version to v0.33.2 (#14208)
The flux in influxdb has been upgraded to use v0.33.2. A lot of
interfaces for the storage engine were changed during this so code had
to change to accomodate the new interfaces and remove the old ones.

Included in this commit is a patch file for the changes that were made.
A patch was generated for the following packages:

* `flux/stdlib/influxdata/influxdb`
* `storage/reads`
* `tsdb/cursors`

These are the three packages that are in common with version 2 of the
database and the first of these packages contains the specific
implementations that are used for version 1.

It is very possible that the next time we upgrade this, the patch will
not apply cleanly just like it wouldn't have applied cleanly to this
update. The patch is mostly meant to document exactly what changed
during the copy over to help ensure we don't forget things when adapting
the interfaces.

Add a patch file to hopefully make this easier in the future
2019-06-27 13:52:02 -05:00
Stuart Carnie a0f7c15b08
chore: Fix constant for 32-bit architecture 2019-06-07 11:00:13 -07:00