297 Commits (09b0258ab45cd5a505d4a5b6c575ad10e04e3194)

Author SHA1 Message Date
Stuart Carnie dee8977d2c
chore: move v2/v1/tsdb → v2/tsdb 2020-08-26 10:46:47 -07:00
Mark Rushakoff f2898d1992 Wipe out workspace in preparation for v2 merge
"Knock knock."

"Who's there?"

"InfluxDB Veet."

...
2019-01-11 10:38:50 -08:00
Jeff Wendling 9f0cd683b9
Merge pull request #10516 from influxdata/jmw-conflict-concurrency
tsdb: conflict based concurrency resolution
2018-11-29 14:14:24 -07:00
Ben Johnson 298eddb82c
Skip and warn series files in retention policy directory. 2018-11-28 11:20:18 -07:00
Jeff Wendling 4cad51a604 tsdb: conflict based concurrency resolution
There are some problematic races that occur when deletes happen
against writes to the same points at the same time. This change
introduces guards and an epoch based system to coordinate these
modifications.

A guard matches a point based on the time, measurement name, and
some conditions loaded from an influxql expression. The intent
is to be as precise as possible without allowing any false
negatives: if a point would be deleted, the guard must match it.
We are allowed to match more points than necessary, at the cost
of slowing down writes.

The epoch based system keeps track of outstanding writes and
deletes and their associated guards. When a delete operation
is going to start, it waits until all current writes are
done, and installs its guard, blocking all future writes that
contain points that may conflict with the delete. This allows
writes to disjoint points to proceed uncontended, and the
implementation is optimized for assuming there are few
outstanding deletes. For example, in the case that there are no
deletes, a write just has to take a mutex, bump a counter, and
compare a value against zero. The epoch trackers are per shard,
so that different shards never have to contend with one another.
2018-11-21 19:19:53 -07:00
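A minimal sketch of the guard and epoch idea described in the commit above. This is not the code from the commit: the `point`, `guard`, and `tracker` types and all method names are hypothetical, and the guard only checks measurement name and a time range (no influxql conditions). A delete installs its guard and waits only for writes that started in an earlier epoch, while writes to disjoint points proceed.

```go
package main

import (
	"fmt"
	"sync"
)

// point and guard are hypothetical, simplified types for this sketch.
type point struct {
	measurement string
	time        int64
}

// guard over-approximates the points a delete may remove.
type guard struct {
	measurement      string
	minTime, maxTime int64
}

func (g guard) matches(p point) bool {
	return p.measurement == g.measurement && p.time >= g.minTime && p.time <= g.maxTime
}

// tracker coordinates the writes and deletes of a single shard.
type tracker struct {
	mu       sync.Mutex
	cond     *sync.Cond
	epoch    uint64
	inflight map[uint64]int   // outstanding writes, keyed by start epoch
	guards   map[uint64]guard // guards of outstanding deletes
}

func newTracker() *tracker {
	t := &tracker{inflight: map[uint64]int{}, guards: map[uint64]guard{}}
	t.cond = sync.NewCond(&t.mu)
	return t
}

// conflicts reports whether any installed guard matches any point.
// Callers must hold t.mu.
func (t *tracker) conflicts(pts []point) bool {
	for _, g := range t.guards {
		for _, p := range pts {
			if g.matches(p) {
				return true
			}
		}
	}
	return false
}

// startWrite blocks while a guard matches, then registers the write.
func (t *tracker) startWrite(pts []point) uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	for t.conflicts(pts) {
		t.cond.Wait()
	}
	t.inflight[t.epoch]++
	return t.epoch
}

func (t *tracker) endWrite(epoch uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.inflight[epoch]--
	if t.inflight[epoch] == 0 {
		delete(t.inflight, epoch)
	}
	t.cond.Broadcast()
}

// olderWrites reports whether a write from before epoch id is still running.
func (t *tracker) olderWrites(id uint64) bool {
	for e := range t.inflight {
		if e < id {
			return true
		}
	}
	return false
}

// startDelete installs g, then waits for writes that began before it.
func (t *tracker) startDelete(g guard) uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.epoch++
	id := t.epoch
	t.guards[id] = g
	for t.olderWrites(id) {
		t.cond.Wait()
	}
	return id
}

func (t *tracker) endDelete(id uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.guards, id)
	t.cond.Broadcast()
}

func main() {
	t := newTracker()
	id := t.startDelete(guard{"cpu", 0, 10})
	w := t.startWrite([]point{{"mem", 5}}) // disjoint point: not blocked
	t.endWrite(w)
	t.endDelete(id)
	fmt.Println("disjoint write completed while the delete was active")
}
```

With no deletes installed, `startWrite` in this sketch only takes the mutex, finds no guards, and bumps a counter, which mirrors the cheap path the commit message describes.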
Jeff Wendling 030adf4bd5 tsdb: don't allow deletes to a database in mixed index mode
TSI1 and inmem indexes have different properties during deletes.
Specifically, inmem shares a global index across all shards, whereas
every tsi1 index is confined to a specific shard. When deleting
a series, it may cause the last reference to the series across all
shards to be dropped, necessitating a removal from the series file.
Since the inmem index shares the index across all shards, removing
the series when it's removed from the series file is sufficient.

However, in the case of a mixed index database, if the last shard
is a TSI1 shard, the other inmem indexes are not available when we
discover that it was the last reference to the series. This ends
up leaving the series in the inmem index without a series id in
the series file, causing all sorts of misbehavior.

Rather than continue curling ourselves into a ball to try to fix
this unsupported mode, give a helpful error message to the user
that they must run their database in a non-mixed index mode to
allow deletes.
2018-11-21 18:18:38 -07:00
Edd Robinson cade59e253 Fix panic in IndexSet
This commit fixes a panic where a concurrent removal of a shard and meta
query could cause a `nil` index to be added to the `IndexSet`.
2018-10-26 18:23:54 +01:00
Stuart Carnie 9520b8d956 fix(tsdb): Fix race calling filterShards outside a lock
Move filterShards inside the lock, as it enumerates the shards map,
which can result in a data race when the map is written concurrently.
2018-10-17 14:14:53 -07:00
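The pattern behind the fix above can be sketched as follows. This is an illustration under assumptions, not the store's actual code: the `store` type, its fields, and the `shardIDs` and `addShard` helpers are hypothetical. The point is only that the function enumerating the map runs while the read lock is held.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// store is a hypothetical, simplified stand-in for the real store.
type store struct {
	mu     sync.RWMutex
	shards map[uint64]string // shard ID -> database name
}

// filterShards enumerates the shards map. Callers must hold s.mu.
func (s *store) filterShards(keep func(db string) bool) []uint64 {
	var ids []uint64
	for id, db := range s.shards {
		if keep(db) {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

// shardIDs takes the read lock before calling filterShards, so the map is
// never iterated while addShard is writing to it.
func (s *store) shardIDs(db string) []uint64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.filterShards(func(d string) bool { return d == db })
}

func (s *store) addShard(id uint64, db string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.shards[id] = db
}

func main() {
	s := &store{shards: map[uint64]string{}}
	var wg sync.WaitGroup
	for i := uint64(1); i <= 4; i++ {
		wg.Add(1)
		go func(id uint64) {
			defer wg.Done()
			s.addShard(id, "db0")
		}(i)
	}
	wg.Wait()
	fmt.Println(s.shardIDs("db0"))
}
```

Running a variant that calls `filterShards` without the lock under `go run -race` is one way to reproduce this class of bug.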
Edd Robinson f52de2d1e7 Ensure orphaned series removed from inmem index
This commit ensures that any orphaned series (series that are to be
removed and no longer are referenced anywhere in the database) are
removed from the `inmem` index when a shard is dropped.
2018-08-21 15:00:35 +01:00
Edd Robinson dece5b847f Refactor index names 2018-08-21 14:32:30 +01:00
Jacob Marble 786d637780 tsdb: Cleanup compaction throughput code 2018-08-07 11:12:41 -07:00
Zach Goldstein 0ef3752a1a Add configuration parameter to expose rate limit for TSM compaction.
Closes: 9938
2018-08-07 10:05:36 -04:00
Edd Robinson 9eece563b1 Simplify loops 2018-08-05 15:16:33 +01:00
Jeff Wendling 63fbf53699
Merge pull request #10063 from influxdata/jmw-extra-log-context
Make store include context in logs
2018-07-18 11:53:22 -06:00
Edd Robinson 95db829631 Remove default max concurrent compaction limit
PR #9204 introduced a maximum default concurrent compaction limit of 4.
The idea was to reduce IO utilisation on large systems with many cores,
and high write load. Often on these systems, disks were not scaled
appropriately to the write volume, and while the write path could
keep up, compactions would saturate disks.

In #9225 work was done to reduce IO saturation by limiting the
compaction throughput. To some extent, both #9204 and #9225 work towards
solving the same problem.

We have recently begun to notice larger clusters suffering from
situations where compactions are not keeping up because they have been
scaled up, but the limit of 4 has stayed in place. While users can
manually override the setting, it seems more user friendly if we remove
the limit by default, and set it manually in cases where compactions are
causing too much IO on large boxes.
2018-07-18 17:27:49 +01:00
Edd Robinson 55ffeb563a Tidy up logging of compaction settings 2018-07-18 17:26:34 +01:00
Jeff Wendling 7bdbe26534 Make store include context in logs
If some error or message is in the context of some shard or database,
be sure to include that context in the message.
2018-07-18 10:22:53 -06:00
David Norton 6016a80997 allow tag keys to contain underscores 2018-07-17 09:39:08 -04:00
Stuart Carnie 88cd9f3fcf pr(influx-tools): Improvements per PR review 2018-06-13 10:29:59 -07:00
Stuart Carnie 7e998779e6 feat(tsdb/store): Option to disable compactions for offline tools
Allows an offline tool to open the tsdb.Store with compactions disabled.
2018-06-13 10:29:59 -07:00
Stuart Carnie 7abf3ec048 fix(tsdb/store): Fix hang when closing Store if monitor is disabled 2018-06-13 10:29:59 -07:00
Ben Johnson d3e3b05a49
Add tsm1 open limiter
This commit restricts the number of TSM1 files that can be opened
concurrently across the entire `tsdb.Store`. There is currently
a limit for the number of shards that can be opened concurrently,
however, this limit does not help when the number of CPU cores
is higher than the number of shards. Because TSM1 files have a 2GB
limit and there is no limit on the number of files per shard,
extremely large shards (1TB+) can load 1,000s of files simultaneously.
2018-05-29 10:21:53 -06:00
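A store-wide open limiter of the kind described above can be sketched with a buffered channel used as a counting semaphore. This is an assumption about the general technique, not the commit's implementation: `limiter`, `newLimiter`, and `openAll` are hypothetical names, and the file open is simulated.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limiter is a counting semaphore built on a buffered channel.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }
func (l limiter) take()        { l <- struct{}{} }
func (l limiter) release()     { <-l }

// openAll simulates opening n files concurrently, each open holding one
// limiter slot, and returns the highest number of simultaneous opens seen.
func openAll(n int, l limiter) int64 {
	var cur, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.take()
			defer l.release()
			c := atomic.AddInt64(&cur, 1)
			for { // record a new peak if this open raised it
				p := atomic.LoadInt64(&peak)
				if c <= p || atomic.CompareAndSwapInt64(&peak, p, c) {
					break
				}
			}
			atomic.AddInt64(&cur, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	peak := openAll(1000, newLimiter(8))
	fmt.Println("peak concurrent opens within limit:", peak <= 8)
}
```

Because the limiter is shared by every caller, the bound holds across all shards together, which a per-shard limit cannot guarantee.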
Jacob Marble 735aa2d7dc Add SeriesIDSet() to Index interface 2018-05-18 09:22:43 -07:00
Jacob Marble 7f8b7af61e
Cleanup index memory footprint counting code (#9828)
* Fix IndexSet.DedupeInmemIndexes

* Cleanup index memory footprint code
2018-05-15 11:25:19 -07:00
Jacob Marble 0763d1789e Get inmem index bytes without double-counting 2018-05-10 11:33:52 -07:00
Jacob Marble 7de2dcd3d9 TSM: TSMReader.Close blocks until reads complete 2018-04-30 13:46:03 -07:00
Edd Robinson 0b4a403679 Provide warning when mixed index used on db 2018-04-25 13:57:08 +01:00
Edd Robinson 32e195860b Log index type when opening shard 2018-04-25 13:02:09 +01:00
Stuart Carnie 14dcc5d6e7 PR feedback 2018-04-19 18:05:55 -07:00
Stuart Carnie e7389b18c0 tsdb: add additional engine options
* filters allow specific combinations of database, retention policy and
  shard groups to be opened. This was added to reduce the start-up time
  of the export tool and limit the memory usage.
2018-04-19 18:05:55 -07:00
Ben Johnson d0688201ba
Fix missing Store.Close() unlock. 2018-03-06 10:36:44 -07:00
Stuart Carnie a74d296200 use underscore vs period, fix doc comment, add database name to CQ 2018-02-26 10:08:43 -07:00
Stuart Carnie d135aecf02 Generate trace logs for a number of significant influx operations
* tsdb Store.Open traces all events related to opening files
    * op.name : tsdb.open
* retention policy shard deletions
    * op.name : retention.delete_check
* all TSM compaction strategies
    * op.name : tsm1.compact_group
* series file compactions
    * op.name : series_partition.compaction
* continuous query execution (if logging enabled)
    * op.name : continuous_querier.execute
* TSI log file compaction
    * op_name: index.tsi.compact_log_file
* TSI level compaction
    * op.name: index.tsi.compact_to_level
2018-02-21 15:08:49 -07:00
Jonathan A. Sternberg d38413a849
Merge pull request #9454 from influxdata/js-structured-logging
Update logging calls to take advantage of structured logging
2018-02-21 09:14:40 -06:00
Jonathan A. Sternberg 0727ffbf4e
Mark a shard as in process of being deleted
Without this, deleting a shard could trigger things so that a write
would attempt to create the shard again before it was actually deleted.
2018-02-20 12:17:30 -07:00
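One way to picture the fix above is a set of shard IDs that are pending deletion, consulted before a shard is created. The sketch below is hypothetical throughout (`store`, `pendingDeletes`, `errPendingDelete`, and the `remove` callback are illustrative names), and the on-disk removal is simulated by a callback.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errPendingDelete = errors.New("shard is pending deletion")

// store is a hypothetical, simplified stand-in for the real store.
type store struct {
	mu             sync.Mutex
	shards         map[uint64]bool
	pendingDeletes map[uint64]bool
}

func newStore() *store {
	return &store{shards: map[uint64]bool{}, pendingDeletes: map[uint64]bool{}}
}

// createShard refuses to recreate a shard whose deletion is still running.
func (s *store) createShard(id uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.pendingDeletes[id] {
		return errPendingDelete
	}
	s.shards[id] = true
	return nil
}

// deleteShard marks the shard as pending deletion, performs the slow
// removal outside the lock, and clears the mark once removal is done.
func (s *store) deleteShard(id uint64, remove func()) {
	s.mu.Lock()
	delete(s.shards, id)
	s.pendingDeletes[id] = true
	s.mu.Unlock()

	remove() // stands in for removing the shard's files

	s.mu.Lock()
	delete(s.pendingDeletes, id)
	s.mu.Unlock()
}

func main() {
	s := newStore()
	_ = s.createShard(1)
	s.deleteShard(1, func() {
		// A write arriving mid-delete cannot bring the shard back.
		fmt.Println("create during delete:", s.createShard(1))
	})
	fmt.Println("create after delete:", s.createShard(1))
}
```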
Jonathan A. Sternberg 2bbd96768d Update logging calls to take advantage of structured logging
Includes a style guide that details the basics of how to log.
2018-02-20 10:04:19 -06:00
Edd Robinson 433e643364 Fix data race when collecting sketches 2018-02-15 11:16:32 +00:00
Edd Robinson e5c8fd9dc5 Ensure nil sketches never returned 2018-02-09 15:29:42 +00:00
Edd Robinson 544329380f
Add empty series sketches back to tsi1 index
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.

The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.

This commit does not provide working sketches for post-compaction on the
tsi1 index.
2018-02-07 14:52:13 -07:00
Edd Robinson 42c3adeffc simplify packages under tsdb 2018-01-21 09:41:27 -08:00
Edd Robinson 4ccb6ada69 Remove unused code/cleanup tsdb package 2018-01-20 14:06:15 +00:00
Jason Wilder 8f52e442e6 Fix deadlock in DeleteSeries
Store.DeleteSeries held an RLock while deleting from each shard.
While deleting, the Engine uses shardSet to see if a series is fully
deleted.  The shardSet.ForEach also takes an RLock.  If a Lock is
requested between these two calls, a deadlock occurs.

To fix, we don't need to hold an RLock for the duration of the delete
in the store as each Shard handles concurrency itself and we have a
snapshot of the shards we need to access.
2018-01-17 10:28:21 -07:00
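The snapshot approach described above can be sketched like this. Go's `sync.RWMutex` documentation warns against recursive read locking, because a pending `Lock` blocks new readers; taking a snapshot and releasing the lock avoids holding one `RLock` while acquiring another. The `store`, `shard`, and `forEach` names here are hypothetical simplifications, with `forEach` standing in for `shardSet.ForEach`.

```go
package main

import (
	"fmt"
	"sync"
)

// shard handles its own concurrency with a private mutex.
type shard struct {
	mu     sync.Mutex
	series map[string]bool
}

func (sh *shard) deleteSeries(key string) {
	sh.mu.Lock()
	defer sh.mu.Unlock()
	delete(sh.series, key)
}

func (sh *shard) hasSeries(key string) bool {
	sh.mu.Lock()
	defer sh.mu.Unlock()
	return sh.series[key]
}

type store struct {
	mu     sync.RWMutex
	shards map[uint64]*shard
}

// forEach stands in for shardSet.ForEach: it takes its own read lock.
func (s *store) forEach(fn func(*shard)) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	for _, sh := range s.shards {
		fn(sh)
	}
}

// deleteSeries snapshots the shards under the read lock, releases it, and
// only then deletes, so no RLock is held when forEach acquires its own.
func (s *store) deleteSeries(key string) (remaining int) {
	s.mu.RLock()
	snapshot := make([]*shard, 0, len(s.shards))
	for _, sh := range s.shards {
		snapshot = append(snapshot, sh)
	}
	s.mu.RUnlock()

	for _, sh := range snapshot {
		sh.deleteSeries(key)
		// Models the engine checking whether the series is fully deleted.
		remaining = 0
		s.forEach(func(o *shard) {
			if o.hasSeries(key) {
				remaining++
			}
		})
	}
	return remaining
}

func main() {
	s := &store{shards: map[uint64]*shard{
		1: {series: map[string]bool{"cpu,host=a": true}},
		2: {series: map[string]bool{"cpu,host=a": true, "mem,host=a": true}},
	}}
	fmt.Println("shards still holding the series:", s.deleteSeries("cpu,host=a"))
}
```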
Edd Robinson bd762380b0 Use bitsets to calculate series cardinality 2018-01-16 23:22:52 +00:00
Edd Robinson ceb3abd118 Remove series when shard rolls over
Series should only be removed from the series file when they're no
longer present in any shard. This commit ensures that during a shard
rollover, the series local to the shard are checked against all other
series in the database.

Series that are no longer present in any other shards' bitsets are then
marked as deleted in the series file.
2018-01-16 15:58:20 +00:00
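The check described above amounts to a set difference: a series from the dropped shard is removable only if no other shard still contains it. The sketch below substitutes plain Go maps for the bitsets the commit mentions, and `seriesSet` and `orphaned` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// seriesSet is a map-based stand-in for a per-shard series ID bitset.
type seriesSet map[uint64]struct{}

func newSeriesSet(ids ...uint64) seriesSet {
	s := seriesSet{}
	for _, id := range ids {
		s[id] = struct{}{}
	}
	return s
}

// orphaned returns, in ascending order, the IDs in dropped that appear in
// none of the other shards' sets. Only these may be marked as deleted.
func orphaned(dropped seriesSet, others []seriesSet) []uint64 {
	var out []uint64
	for id := range dropped {
		found := false
		for _, o := range others {
			if _, ok := o[id]; ok {
				found = true
				break
			}
		}
		if !found {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	dropped := newSeriesSet(1, 2, 3)
	others := []seriesSet{newSeriesSet(2), newSeriesSet(3, 4)}
	fmt.Println(orphaned(dropped, others)) // prints [1]
}
```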
Edd Robinson e902998f4e All closes are now fast 2018-01-16 14:56:54 +00:00
Edd Robinson 8039165ab4 Ensure no double r-locking occurs in IndexSet
use. However, because the reference counting was implemented via
mutexes, it was possible to double `RLock` the series file mutex. This
allows a `Lock` to arrive in-between the two `RLock`s, (such as when
deleting the database), causing deadlock.

This commit addresses this by ensuring that from within `IndexSet`
methods, when calling other `IndexSet` methods, that they're all
unexported, and that those unexported methods never take a lock on the
series file.

Keeping series file locking in exported `IndexSet` methods only, allows
one to see any future races more easily.
2018-01-16 14:56:34 +00:00
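The locking convention described above can be sketched as follows: exported methods take the lock exactly once and delegate to unexported helpers that never lock. The `indexSet` type here is a hypothetical miniature, with a single `sync.RWMutex` standing in for the series file lock.

```go
package main

import (
	"fmt"
	"sync"
)

// indexSet is a hypothetical miniature of the real type.
type indexSet struct {
	mu     sync.RWMutex        // stands in for the series file lock
	series map[string][]string // measurement -> series keys
}

// SeriesCount is exported: it takes the read lock once and delegates.
func (is *indexSet) SeriesCount(measurement string) int {
	is.mu.RLock()
	defer is.mu.RUnlock()
	return is.seriesCount(measurement)
}

// HasSeries is exported. It calls the unexported seriesCount rather than
// SeriesCount, so the read lock is never taken twice on one call path.
func (is *indexSet) HasSeries(measurement string) bool {
	is.mu.RLock()
	defer is.mu.RUnlock()
	return is.seriesCount(measurement) > 0
}

// seriesCount is unexported and never locks. Callers must hold is.mu.
func (is *indexSet) seriesCount(measurement string) int {
	return len(is.series[measurement])
}

func main() {
	is := &indexSet{series: map[string][]string{"cpu": {"cpu,host=a", "cpu,host=b"}}}
	fmt.Println(is.SeriesCount("cpu"), is.HasSeries("cpu"), is.HasSeries("mem"))
}
```

Had `HasSeries` called `SeriesCount` instead, a writer requesting `Lock` between the two `RLock` calls could block the second one and deadlock, which is the situation the commit describes.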
Jason Wilder ba9a5af7eb Mark series deleted in series file
This commit adds the ability to correctly mark a series as deleted in
the global series file. Whenever a shard engine determines that a series
should be deleted, it checks with each shard's bitset for series that
are to be deleted and are no longer contained in any shard-local
bitsets.

These series are then removed from the series file.
2018-01-15 12:00:30 +00:00
Edd Robinson 286c8f4c09 Return to original DELETE/DROP SERIES semantics
This reverts commit 59afd8cc90.
2018-01-15 12:00:30 +00:00
Jason Wilder 874d5839da Don't return error for non-existent series file
When dropping series, if the series file does not exist we returned
an error.  This breaks compatibility with prior versions that would
not return an error if the series does not exist.
2018-01-14 12:53:26 -07:00
Jason Wilder 5d1f76192a Ensure series file is not closed while in use 2018-01-12 16:58:33 -07:00