Commit Graph

77 Commits (14d059ac4d8b0893a4ebee1c7d5f0db3e1b46821)

Author SHA1 Message Date
Ben Johnson 9237ee6a40 fix(tsi1): Remove TSI cardinality stats cache 2019-09-04 14:48:22 -06:00
Edd Robinson 030083e1a3 perf(storage): optimistically check compactions 2019-09-04 17:38:13 +01:00
Ben Johnson 729558d64b fix(tsdb): Replace TSI compaction wait group with counter.
Previously the TSI partition would panic if a compaction was
started while `Wait()` was waiting. This commit removes the previous
wait group and replaces it with a simple counter. The `Wait()`
function now polls the counter until it reaches zero.
2019-09-02 09:37:35 -06:00
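The counter-and-poll approach described in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: `compactionCounter`, its method names, and `pollInterval` are hypothetical and are not taken from the tsi1 source.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pollInterval is an arbitrary value chosen for this sketch.
const pollInterval = time.Millisecond

// compactionCounter is a hypothetical stand-in for the counter described in
// the commit message; it is not the actual tsi1 type.
type compactionCounter struct {
	mu sync.Mutex
	n  int
}

// Start records that a compaction has begun. Unlike a wait group, it is
// safe to call while another goroutine is inside Wait.
func (c *compactionCounter) Start() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Done records that a compaction has finished.
func (c *compactionCounter) Done() {
	c.mu.Lock()
	c.n--
	c.mu.Unlock()
}

// Active returns the number of compactions currently running.
func (c *compactionCounter) Active() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// Wait polls the counter until it reaches zero.
func (c *compactionCounter) Wait() {
	for c.Active() > 0 {
		time.Sleep(pollInterval)
	}
}

func main() {
	c := &compactionCounter{}
	c.Start()
	go func() {
		time.Sleep(5 * time.Millisecond) // simulated compaction work
		c.Done()
	}()
	c.Wait()
	fmt.Println("active compactions:", c.Active())
}
```

Polling trades a little latency in `Wait()` for the freedom to start new work at any time, which is the property the commit message asks for.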
Max U 36d3a6ea82 refactor(tsi1): address comments to clean up tool 2019-08-23 14:08:00 -07:00
Max U 9fc99c2724 feat(tsi1): port the dump-tsi tool to 2.x 2019-08-23 14:07:30 -07:00
Edd Robinson 9f3cbdc80e test(storage): add benchmark for series creation
This benchmark exercises creating (or checking if series need creating)
in the TSI index and the Series File.
2019-08-12 13:50:02 +01:00
maxunt 757fb4f80c Merge pull request #14280 from influxdata/er-rename
feat(fs): API for replacing os calls
2019-08-07 11:33:57 -07:00
Max U ad188d6465 refactor(tsi1): remove extraneous logging 2019-08-05 13:21:13 -07:00
Max U 64747e9781 refactor(tsi1): address config changes to report-tsi tool 2019-08-05 10:03:32 -07:00
Adam Perlin 4fef1683a0 refactor(tsi1): address review comments for report-tsi tool 2019-07-26 16:21:11 -07:00
Adam Perlin a0f4d714ea chore(tsi1): rename tsi1_report.go -> report.go 2019-07-26 11:17:02 -07:00
Adam Perlin d47a578258 fix(tsi1): map org to bucket in report-tsi tool so output is more useful 2019-07-26 11:17:02 -07:00
Adam Perlin 7ce1b8109f chore(tsi1): Clean up flags and naming in report-tsi tool; add comments 2019-07-26 11:16:59 -07:00
Max U 9bd6200f15 fix(tsi1): make mergeable 2019-07-26 11:16:12 -07:00
Max U 2c1f3e2987 fix(tsi1): remove obnoxious log messages 2019-07-26 11:16:12 -07:00
Max U aa2f7a8ff7 feat(tsi1): add a --top flag for limiting output, output now sorted 2019-07-26 11:12:15 -07:00
Adam Perlin 32b283d25a feat(tsi1/report): Add ability to filter by measurement; add additional maps for efficient retrieval of total org/bucket cardinalities 2019-07-26 11:12:15 -07:00
Max U 5e5fa96c5b feat(tsi1): add flags for --org-id and --bucket-id 2019-07-26 11:12:15 -07:00
Max U bfd38d93d8 feat(tsi1): provide API tooling for use in testing 2019-07-26 11:12:15 -07:00
Max U 8f99d20deb feat(tsi1): port report-tsi tool to influxdb 2.x 2019-07-26 11:12:15 -07:00
Max U eb6d0f4478 feat(tsi): report cardinality for all indexes, still needs to be cleaned
Fix iteration logic and clean up
2019-07-26 11:12:00 -07:00
Max U 36e578122e feat(tsi): placeholder 2019-07-26 11:11:22 -07:00
Max U 41cc23cc35 fix(tsi1): clean up some error checking 2019-07-26 11:10:47 -07:00
Max U b9ede87508 fix(tsi): error trace for engine failure, not working 2019-07-26 11:09:40 -07:00
Max U 2202d727da fixes merge conflicts 2019-07-08 14:07:04 -04:00
maxunt ca5a599261 Merge branch 'master' into er-rename 2019-07-08 13:42:24 -04:00
Max U 39f51969e9 replaced os.Create calls w API calls to fs.CreateFile, includes unit test 2019-07-08 13:01:42 -04:00
Max U c669a32ff3 change fs.RenameFile to fs.RenameFileWithReplacement when compacting partition stats files as they must go to the same place 2019-07-03 14:23:21 -04:00
Max U fe748128e3 replaces os.Rename calls w api calls to fs.RenameFile. tests now are failing 2019-07-03 13:14:43 -04:00
Christopher Wolff a82e2cb180 chore(tsdb): skip flaky test 2019-05-30 16:29:31 -07:00
Alirie Gray 576da8f9d2 fix(swagger): add log property to task runs endpoint docs 2019-05-17 14:08:10 -07:00
Christopher Wolff 52a98aae2b chore(tsdb): skip flaky test
https://github.com/influxdata/influxdb/issues/13755
2019-05-14 12:52:37 -07:00
Jacob Marble 8c269e0153 chore(log): Put trace_id back in logs (#13712)
* chore(log): Put trace_id back in logs

* fix tests
2019-04-30 18:51:22 -07:00
Jeff Wendling 9cd7c0f7e3 tsi1: don't do verbose debug logging unless test fails 2019-04-29 14:01:45 -06:00
Jeff Wendling 59279837e5 tsi1: partition close deadlock
When a tsi1 partition closes, it waits on the wait group for compactions
and then acquires the lock. Unfortunately, a compaction may start in the
mean time, holding on to some resources. Then, close will attempt to
close those resources while holding the lock. That will block until
the compaction has finished, but it also needs to acquire the lock
in order to finish, leading to deadlock.

One cannot just move the wait group wait into the lock because, once
again, the compaction must acquire the lock before finishing. Compaction
can't finish before acquiring the lock because then it might be operating
on an invalid resource.

This change splits the locks into two: one to protect just against
concurrent Open and Close calls, and one to protect all of the other
state. We then just close the partition, acquire the lock, then free
the resources. Starting a compaction requires acquiring a resource
to the partition itself, so that it can't start one after it has
started closing.

This change also introduces a cancellation channel into a reference
to a resource that is closed when the resource is being closed, allowing
processes that have acquired a reference to clean up quicker if someone
is trying to close the resource.
2019-04-22 09:06:32 -06:00
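The last two paragraphs of the commit above (acquiring a reference before starting work, plus a cancellation channel that fires when the resource starts closing) might look roughly like this. It is a minimal sketch, not the code from the commit: `resource`, `reference`, `Closing`, and `errClosed` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClosed is a hypothetical error returned once closing has begun.
var errClosed = errors.New("resource closed")

// resource is an illustrative reference-counted resource.
type resource struct {
	mu      sync.Mutex
	closed  bool
	closing chan struct{}  // closed when Close begins
	refs    sync.WaitGroup // outstanding references
}

func newResource() *resource {
	return &resource{closing: make(chan struct{})}
}

// Acquire hands out a reference, or fails if closing has already started.
// Because the closed flag is checked under mu, no reference can be added
// after Close has begun waiting.
func (r *resource) Acquire() (*reference, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.closed {
		return nil, errClosed
	}
	r.refs.Add(1)
	return &reference{res: r}, nil
}

// Close marks the resource as closing, signals holders through the closing
// channel, then blocks until every reference has been released. It does not
// hold mu while waiting, so holders can still make progress.
func (r *resource) Close() {
	r.mu.Lock()
	if r.closed {
		r.mu.Unlock()
		return
	}
	r.closed = true
	close(r.closing)
	r.mu.Unlock()
	r.refs.Wait()
}

// reference is a lease on the resource.
type reference struct {
	res  *resource
	once sync.Once
}

// Closing returns a channel that is closed once the resource starts closing.
func (ref *reference) Closing() <-chan struct{} { return ref.res.closing }

// Release gives the reference back; calling it more than once is harmless.
func (ref *reference) Release() { ref.once.Do(ref.res.refs.Done) }

func main() {
	r := newResource()
	ref, err := r.Acquire()
	if err != nil {
		fmt.Println("acquire failed:", err)
		return
	}

	// A simulated compaction that stops early when the resource is closing.
	go func() {
		defer ref.Release()
		<-ref.Closing()
	}()

	r.Close() // returns only after the goroutine has released its reference
	_, err = r.Acquire()
	fmt.Println("acquire after close:", err)
}
```

The important detail is that `Close` releases the mutex before waiting, so a worker that needs the lock to finish cannot deadlock against it.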
Todd Persen 138c17f22c Fix typos in tsdb package 2019-04-17 12:55:38 -07:00
Ben Johnson 2b3ce82852 fix(tsdb): Remove TSI stats file cache
Removes the `STATS` file generated during TSI compaction as it had
potential for becoming inconsistent with the index data. Instead,
stats are recalculated on start up and on each compaction on a
per-partition basis.

Computing stats for 10M series across 10K measurements takes
approximately 0.171s.
2019-04-17 09:34:32 -06:00
Jacob Marble 603a1f26e0 use tracing.StartSpanFromContext 2019-03-07 12:12:31 -07:00
Jacob Marble b9c7ec439e feat(influxd): Tracing refactor (#12318)
* feat(launcher): Tracing to log disabled by default

* remove traceLogger and use opentracing directly

* add Jaeger tracing

* go vet && go fmt
2019-03-04 11:48:11 -08:00
Jeff Wendling 0fae44e219 storage: fix problems with keeping resources alive
This commit adds the pkg/lifecycle.Resource to help manage opening,
closing, and leasing out references to some resource. A resource
cannot be closed until all acquired references have been released.
If the debug_ref tag is enabled, all resource acquisitions keep
track of the stack trace that created them and have a finalizer
associated with them to print on stderr if they are leaked. It also
registers a handler on SIGUSR2 to dump all of the currently live
resources.

Having resources tracked in a uniform way with a data type allows us
to do more sophisticated tracking with the debug_ref tag, as well.
For example, we could panic the process if a resource cannot be
closed within a certain time frame, or attempt to figure out the
DAG of resource ownership dynamically.

This commit also fixes many issues around resources, correctness
during error scenarios, reporting of errors, idempotency of
close, tracking of memory for some data structures, resource leaks
in tests, and out of order dependency closes in tests.
2019-02-28 10:22:01 -07:00
Jeff Wendling 26ca30e97a Ensure that cached series id sets are Go heap backed 2019-02-12 16:33:35 -07:00
Ben Johnson cf29b6bca4 Convert TagValueSeriesIDCache to use string fields 2019-02-12 14:45:38 -07:00
Edd Robinson 1188d75a99 Merge pull request #11202 from influxdata/er-tsi-times
Add skeleton TSI design doc
2019-01-28 11:54:59 -08:00
Edd Robinson 19a36e0dc7 Remove copy-on-write when caching bitmaps
In the case of caching TSI bitmaps belonging to immutable .tsi files,
the underlying bitset data can be mmapped. It is possible, though rare,
for this data to be unmapped (e.g., via a TSI compaction) but for the
cached bitmap to be subsequently read. This leads to a segfault.

This only happens when copy-on-write is set to true on the roaring
bitmap, because in that case only the internal pointers are cloned.

This change will reduce the TSI cache performance by around 10%, which I
have deemed to account for only a few microseconds typically.
2019-01-25 13:38:22 +00:00
Edd Robinson 045bb64c5e Add skeleton TSI design doc 2019-01-17 12:22:08 +00:00
Edd Robinson 810b5d9281 Bulk log file delete
This commit adds a method to delete many series ids from the LogFile in
bulk, reducing the number of fsyncs required.
2019-01-15 11:45:12 +00:00
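The idea behind the bulk delete above, writing many entries and syncing once rather than once per series id, can be illustrated as follows. Everything here is an assumption made for the sketch: `deleteSeriesIDs`, `syncWriter`, the 9-byte entry layout, and `memFile` are hypothetical and do not reflect the real LogFile format or API.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// syncWriter is satisfied by *os.File; it is defined here so the sketch can
// run against an in-memory stand-in.
type syncWriter interface {
	io.Writer
	Sync() error
}

// deleteFlag is a made-up marker byte for a "series deleted" entry.
const deleteFlag = 0x01

// deleteSeriesIDs appends one hypothetical tombstone entry per id and then
// syncs a single time, instead of syncing after every entry.
func deleteSeriesIDs(f syncWriter, ids []uint64) error {
	w := bufio.NewWriter(f)
	var entry [9]byte // 1 flag byte + 8-byte big-endian series id
	entry[0] = deleteFlag
	for _, id := range ids {
		binary.BigEndian.PutUint64(entry[1:], id)
		if _, err := w.Write(entry[:]); err != nil {
			return err
		}
	}
	if err := w.Flush(); err != nil {
		return err
	}
	return f.Sync() // one sync for the whole batch
}

// memFile is an in-memory stand-in that counts Sync calls.
type memFile struct {
	buf   bytes.Buffer
	syncs int
}

func (m *memFile) Write(p []byte) (int, error) { return m.buf.Write(p) }

func (m *memFile) Sync() error {
	m.syncs++
	return nil
}

func main() {
	f := &memFile{}
	if err := deleteSeriesIDs(f, []uint64{1, 2, 3}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("wrote %d bytes with %d sync(s)\n", f.buf.Len(), f.syncs)
}
```

With a per-entry sync, the cost of a mass delete would grow with the number of series; batching keeps the number of syncs constant per call.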
Edd Robinson b025d9afa9 Improve efficiency of TSI index series drop
This commit improves the performance of a mass delete on the TSI index
by deleting at the measurement level instead of deleting each series
individually.
2019-01-14 12:46:55 +00:00
Edd Robinson 7ee4f499e1 Clarify best method of set difference 2019-01-14 12:46:53 +00:00
Edd Robinson 20a8528337 Ensure TSI bitset cache cleaned up on m drop 2019-01-14 11:23:13 +00:00
Mark Rushakoff d73d73c0d4 chore: rename imports from platform to influxdb
I did this with a dumb editor macro, so some comments changed too.

Also rename root package from platform to influxdb.

In interest of minimizing risk, anyone importing the root package has
now aliased it to "platform" so that no changes beyond imports were
necessary in those files.

Lastly, replace the old platform module to local path /dev/null so that
nobody can accidentally reintroduce a platform dependency while
migrating platform code to influxdb.
2019-01-09 20:51:47 -08:00