Commit Graph

1218 Commits (ae4a1a4d0060bffaff02e73fb9a931d53a42f032)

Author SHA1 Message Date
Jonathan A. Sternberg 86bd97f3b9 Switch SHOW MEASUREMENTS and SHOW TAG VALUES to directly access the tsdb.Store
The `SHOW MEASUREMENTS` and `SHOW TAG VALUES` cannot go through the
query engine to get the speed they need. They also only need access to
the database index and do not need access to specific shards. This
removes the query rewriting that was done to turn these two queries into
a select statement and reimplements them inside of the coordinator as an
interface on the TSDBStore.
2016-07-28 17:38:11 -05:00
Mark Rushakoff f34a7430e3 Fix length of (*DatabaseIndex).SeriesKeys()
Previously, it would return as many empty strings in the first half of
the slice as valid values at the end of the slice.
2016-07-27 16:07:39 -07:00
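The symptom described in f34a7430e3 (empty strings in the first half, valid values in the second) matches a common Go slip: allocating a slice with a non-zero length and then appending to it. The sketch below assumes that pattern; the function names and the map type are hypothetical and are not taken from `DatabaseIndex`.

```go
package main

import "fmt"

// keysBuggy reproduces the symptom: make with a length pre-fills the slice
// with empty strings, and append then places the real keys after them.
func keysBuggy(series map[string]struct{}) []string {
	keys := make([]string, len(series))
	for k := range series {
		keys = append(keys, k)
	}
	return keys
}

// keysFixed allocates length zero with enough capacity, so append fills the
// slice from index 0 and the result has exactly len(series) entries.
func keysFixed(series map[string]struct{}) []string {
	keys := make([]string, 0, len(series))
	for k := range series {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	series := map[string]struct{}{"cpu,host=a": {}, "cpu,host=b": {}}
	fmt.Println(len(keysBuggy(series)), len(keysFixed(series)))
}
```

With two series the buggy version returns four entries and the fixed version returns two.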
Jason Wilder 7c3d1aac68 Simplify purger.add logic 2016-07-26 13:02:08 -06:00
Jason Wilder cab84ae279 Prevent concurrent compactions from stepping on each other
Normally, compactions do not conflict on the files they are compacting.
If the full cold threshold is set very low, it can cause conflicts where
two compactions compact the same files.  The full compaction was the
only place this could happen as its planning is greedy.

To make this safer for concurrent execution, the compaction tracks which
files are currently being compacted and prevents any new compactions from
starting if the file set overlaps.

Fixes 
2016-07-26 12:58:25 -06:00
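The file tracking described in cab84ae279 can be sketched roughly as follows. This is a minimal illustration and not the planner's actual code: `compactionTracker`, `acquire`, and `release` are hypothetical names. A compaction reserves its whole file set atomically, and is refused if any file is already reserved.

```go
package main

import (
	"fmt"
	"sync"
)

// compactionTracker is a hypothetical helper that records which files are
// currently part of a running compaction.
type compactionTracker struct {
	mu    sync.Mutex
	inUse map[string]struct{}
}

func newCompactionTracker() *compactionTracker {
	return &compactionTracker{inUse: make(map[string]struct{})}
}

// acquire reserves every file in the set, or none of them. It returns false
// if the set overlaps with files that are already being compacted.
func (c *compactionTracker) acquire(files []string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, f := range files {
		if _, busy := c.inUse[f]; busy {
			return false
		}
	}
	for _, f := range files {
		c.inUse[f] = struct{}{}
	}
	return true
}

// release frees the files once the compaction finishes or aborts.
func (c *compactionTracker) release(files []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, f := range files {
		delete(c.inUse, f)
	}
}

func main() {
	tracker := newCompactionTracker()
	first := []string{"000001.tsm", "000002.tsm"}
	second := []string{"000002.tsm", "000003.tsm"}

	fmt.Println(tracker.acquire(first))  // no overlap yet
	fmt.Println(tracker.acquire(second)) // overlaps on 000002.tsm
	tracker.release(first)
	fmt.Println(tracker.acquire(second)) // free again
}
```

Checking the whole set before reserving any file keeps a refused compaction from leaving partial reservations behind.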
Jason Wilder ded6e40d47 Remove lastPlanCheck var
This causes full compactions to not run if the server is running, but
after a restart they do run.
2016-07-26 12:58:25 -06:00
Jason Wilder 2f78c4ec83 Fix race when creating temp file
Using os.O_EXCL is safer than checking and then creating the file.
2016-07-26 12:58:25 -06:00
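The point made in 2f78c4ec83 can be illustrated with the standard library alone. With `os.O_CREATE|os.O_EXCL`, the existence check and the creation happen as one step, so two writers cannot both believe they created the file. The helper names, the file name, and the permission bits below are assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createExclusive creates path only if it does not exist yet. O_EXCL makes
// the call fail instead of silently reusing an existing file.
func createExclusive(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_RDWR, 0666)
}

// demoExclusive tries to create the same temp file twice. It reports whether
// the first attempt succeeded and whether the second failed with "exists".
func demoExclusive() (firstOK, secondExists bool, err error) {
	dir, err := os.MkdirTemp("", "excl-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "000001.tsm.tmp") // hypothetical file name
	f, err := createExclusive(path)
	if err != nil {
		return false, false, err
	}
	f.Close()

	_, err2 := createExclusive(path)
	return true, os.IsExist(err2), nil
}

func main() {
	fmt.Println(demoExclusive())
}
```

A separate stat-then-create sequence leaves a window between the two calls in which another goroutine or process can create the file.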
Cory LaNou 063675b928 updates to make snappy compression tests work again 2016-07-22 14:33:20 -05:00
Cory LaNou 968d322d6d finish tsm file exporter 2016-07-21 17:20:51 -05:00
Jason Wilder fb5a143b08 Fix typos 2016-07-21 12:13:04 -06:00
Jason Wilder 13147efb24 Close underlying cursors when closing iterators
If a query is interrupted via kill query, the tsm files managed
by the file store purger would never get removed because
KeyCursor.Close was never called.

KeyCursor.Close should always be called now.
2016-07-21 12:13:04 -06:00
Jason Wilder 822f409b31 Allow queries to complete before closing TSM files
If a query was running against a file being compacted, we closed the file
and the query would end wherever it had read up to.  This could result
in queries that randomly lost data, but running them again showed the
full results.

We now use a reference counting approach and move the in-use files out
of the way in the filestore and allow the queries to complete against
the old tsm files.  The new files are installed and new queries will
use them.

Fixes 
2016-07-21 12:13:04 -06:00
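The reference counting approach from 822f409b31 might look something like the sketch below. It is an illustration under assumptions, not the file store's real types: `refFile` and its methods are hypothetical, and "removal" is modeled as a flag rather than an actual file deletion.

```go
package main

import (
	"fmt"
	"sync"
)

// refFile is a hypothetical stand-in for a TSM file that queries can hold
// open while a compaction replaces it.
type refFile struct {
	mu      sync.Mutex
	refs    int
	retired bool // a compaction has replaced this file
	removed bool // stand-in for closing and deleting the file
}

// Ref is called by a query before it starts reading the file.
func (f *refFile) Ref() {
	f.mu.Lock()
	f.refs++
	f.mu.Unlock()
}

// Unref is called when the query is done. The last reader of a retired file
// triggers its removal.
func (f *refFile) Unref() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.refs--
	if f.retired && f.refs == 0 {
		f.removed = true
	}
}

// Retire is called when a compaction has installed replacement files. The
// file is removed right away only if no query is using it.
func (f *refFile) Retire() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.retired = true
	if f.refs == 0 {
		f.removed = true
	}
}

// Removed reports whether the file has been cleaned up.
func (f *refFile) Removed() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.removed
}

func main() {
	f := &refFile{}
	f.Ref()                  // a query starts reading
	f.Retire()               // a compaction replaces the file
	fmt.Println(f.Removed()) // still in use by the query
	f.Unref()                // the query finishes
	fmt.Println(f.Removed()) // now cleaned up
}
```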
Cory LaNou fd86670518 remove limiter from walkShards 2016-07-21 11:23:31 -05:00
Edd Robinson f37e726869 Add trace logging statements to tsdb 2016-07-21 11:14:29 +01:00
Edd Robinson 44231abcbd Add trace logger controlled via DataLoggingEnabled 2016-07-21 11:14:29 +01:00
Edd Robinson 217bd4de84 Disable trace logging by default 2016-07-21 11:14:29 +01:00
Edd Robinson 83cc580ff8 Tidy up logging 2016-07-21 11:14:29 +01:00
Mark Rushakoff 518bd3b565 Micro-optimize BooleanDecoder for 20% speedup
benchmark                          old ns/op     new ns/op     delta
BenchmarkBooleanDecoder_2048-4     9954          7846          -21.18%

benchmark                          old allocs     new allocs     delta
BenchmarkBooleanDecoder_2048-4     0              0              +0.00%

benchmark                          old bytes     new bytes     delta
BenchmarkBooleanDecoder_2048-4     0             0             +0.00%
2016-07-20 08:43:05 -07:00
Mark Rushakoff 523aea715a Protect against bounds errors in FloatDecoder 2016-07-19 15:59:27 -07:00
Mark Rushakoff e483689563 Protect against bounds errors in BooleanDecoder 2016-07-19 15:59:27 -07:00
Mark Rushakoff 35e3adc890 Protect against bounds errors in IntegerDecoder 2016-07-19 15:43:27 -07:00
Mark Rushakoff 42b35ca068 Protect against bounds errors in TimeDecoder 2016-07-19 15:43:27 -07:00
Mark Rushakoff be589a6760 Protect against bounds errors in StringDecoder 2016-07-19 15:43:27 -07:00
Mark Rushakoff 5b549ffdfe Handle bounds errors in UnpackBlock 2016-07-19 15:43:27 -07:00
Mark Rushakoff 39f12e376c Defend against some boundary errors in TSM reading 2016-07-19 15:43:27 -07:00
Mark Rushakoff 28f31b4a0c Add test cases to repro corruption panics 2016-07-19 15:36:17 -07:00
Jason Wilder c31f0c25b4 Fix duplicate series getting created
There was a race where the same series would get added to the in-memory
index for a measurement more than once.  This would result in the same
series being returned more than once during queries causing duplicate
results.  The issue was that we check for the series under the read
lock, but did not check again under the write lock where there was
a small window where the series could be added by another goroutine.

We now check for the series under the write lock.

Fixes 
2016-07-18 16:46:36 -06:00
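The fix in c31f0c25b4 follows a familiar pattern: a cheap check under the read lock, then a second check under the write lock before mutating. The sketch below shows that pattern with hypothetical types; it is not the index code itself.

```go
package main

import (
	"fmt"
	"sync"
)

// measurement is a hypothetical in-memory index entry.
type measurement struct {
	mu          sync.RWMutex
	seriesByKey map[string]struct{}
	series      []string
}

func newMeasurement() *measurement {
	return &measurement{seriesByKey: make(map[string]struct{})}
}

// addSeries adds key at most once, even under concurrent callers.
func (m *measurement) addSeries(key string) bool {
	// Fast path: most calls find the series already present.
	m.mu.RLock()
	_, ok := m.seriesByKey[key]
	m.mu.RUnlock()
	if ok {
		return false
	}

	m.mu.Lock()
	defer m.mu.Unlock()
	// Re-check: another goroutine may have added the series between the
	// RUnlock above and this Lock.
	if _, ok := m.seriesByKey[key]; ok {
		return false
	}
	m.seriesByKey[key] = struct{}{}
	m.series = append(m.series, key)
	return true
}

// addConcurrently adds the same key from n goroutines and returns how many
// series the measurement holds afterwards.
func addConcurrently(m *measurement, key string, n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.addSeries(key)
		}()
	}
	wg.Wait()

	m.mu.RLock()
	defer m.mu.RUnlock()
	return len(m.series)
}

func main() {
	m := newMeasurement()
	fmt.Println(addConcurrently(m, "cpu,host=a", 16))
}
```

Without the second check, two goroutines that both miss on the fast path would each append the series.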
Jason Wilder 757f31bd45 Fix panic:runtime error: invalid memory address or nil pointer dereference
github.com/influxdata/influxdb/tsdb.(*Shard).FieldDimensions(0xc820244000, 0xc821b70fb0, 0x1, 0x1, 0xc822b9cc00, 0xc822b9cc30, 0x0, 0x0)
    /Users/jason/go/src/github.com/influxdata/influxdb/tsdb/shard.go:588 +0xa62
github.com/influxdata/influxdb/tsdb.(*shardIteratorCreator).FieldDimensions(0xc8202b6078, 0xc821b70fb0, 0x1, 0x1, 0xc822b9cbd0, 0x0, 0x0, 0x0)
    /Users/jason/go/src/github.com/influxdata/influxdb/tsdb/shard.go:818 +0x53
github.com/influxdata/influxdb/influxql.IteratorCreators.FieldDimensions(0xc821b71250, 0x1, 0x1, 0xc821b70fb0, 0x1, 0x1, 0xc822b9cba0, 0xc822b9cbd0, 0x0, 0x0)
    /Users/jason/go/src/github.com/influxdata/influxdb/influxql/iterator.go:639 +0x15a
github.com/influxdata/influxdb/influxql.(*IteratorCreators).FieldDimensions(0xc822a32ae0, 0xc821b70fb0, 0x1, 0x1, 0x20, 0x18, 0x0, 0x0)
    <autogenerated>:163 +0xd3
2016-07-18 16:35:33 -06:00
Jonathan A. Sternberg 30efa2d922 Merge pull request from influxdata/js-6950-show-measurements-performance
Optimize SHOW MEASUREMENTS so it consults the database index directly
2016-07-18 15:23:17 -05:00
Jason Wilder b692ef4f48 Rename throttle package to limiter 2016-07-18 12:00:58 -06:00
Jonathan A. Sternberg 4121590b01 Optimize SHOW MEASUREMENTS so it consults the database index directly
SHOW MEASUREMENTS doesn't need to visit every shard in the open source
version since all of them contain the same database index.
2016-07-18 12:53:23 -05:00
Jason Wilder c2370b437b Limit in-flight wal writes/encodings
A slower disk can cause excessive allocations to occur when
writing to the WAL because the slower encoding and compression occurs
before taking the write lock.  The encoding/compression grabs a large
byte slice from a pool and ultimately waits until it can acquire the
write lock.

This adds a throttle to limit how many inflight WAL writes can be queued
up to prevent OOMing the process with slower disks and heavy writes.
2016-07-17 23:53:12 -06:00
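One simple way to cap in-flight work, as c2370b437b describes, is a buffered channel used as a counting semaphore. The sketch below assumes that design; the `limiter` type, its method names, and the sleep standing in for encoding and writing are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limiter is a hypothetical fixed-size semaphore built on a buffered channel.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }

// Take blocks until one of the n slots is free.
func (l limiter) Take() { l <- struct{}{} }

// Release frees a slot taken earlier.
func (l limiter) Release() { <-l }

// peakInFlight runs the given number of writers through a limiter and
// returns the highest number that were inside the limited section at once.
func peakInFlight(limit, writers int) int {
	l := newLimiter(limit)
	var (
		mu        sync.Mutex
		cur, peak int
		wg        sync.WaitGroup
	)
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.Take() // wait here before allocating any large buffer
			defer l.Release()

			mu.Lock()
			cur++
			if cur > peak {
				peak = cur
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // stand-in for encode, compress, write

			mu.Lock()
			cur--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(peakInFlight(2, 10) <= 2) // prints "true"
}
```

Taking the slot before the expensive encoding step is what bounds memory: writers that cannot proceed hold no large byte slices while they wait.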
Jason Wilder 46fdcba6e3 Remove compaction enabled logging
Too verbose
2016-07-17 23:53:12 -06:00
Jason Wilder 2fa28ba1d3 Don't log error when compactions are aborted 2016-07-17 23:53:12 -06:00
Jason Wilder b48d88ce9e Abort running compactions when series are deleted
If a delete is issued while a compaction is running, a newly
deleted series could re-appear after the compaction completed. This
could occur if the compaction had already written the blocks for series
that were just deleted.  When the compaction completes, the newly
written tombstone files would be deleted, essentially undeleting the
series.
2016-07-17 23:53:12 -06:00
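Aborting a running compaction, as in b48d88ce9e, can be sketched with a channel that the delete path closes. This is an assumption about the mechanism, shown only to illustrate the idea; `compact` and `errCompactionAborted` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errCompactionAborted is a hypothetical sentinel that callers can recognize
// and choose not to log as a failure.
var errCompactionAborted = errors.New("compaction aborted")

// compact copies blocks one at a time and checks for an abort signal between
// blocks. On abort it discards its partial output.
func compact(blocks []string, abort <-chan struct{}) ([]string, error) {
	var written []string
	for _, b := range blocks {
		select {
		case <-abort:
			return nil, errCompactionAborted
		default:
		}
		written = append(written, b)
	}
	return written, nil
}

func main() {
	abort := make(chan struct{})
	out, err := compact([]string{"b1", "b2"}, abort)
	fmt.Println(len(out), err)

	close(abort) // a delete arrives: signal running compactions to stop
	out, err = compact([]string{"b1", "b2"}, abort)
	fmt.Println(len(out), err == errCompactionAborted)
}
```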
Jason Wilder cc4a668be5 Don't return statistic if engine is closed 2016-07-17 23:53:12 -06:00
Jason Wilder 6710c69aa5 Merge pull request from influxdata/jw-drop
Speed up delete/drop statements
2016-07-15 12:41:08 -06:00
Jason Wilder 21dbe7e854 Simplify throttle type 2016-07-15 12:14:25 -06:00
Jason Wilder d1556e3964 Fix missing read locks before filtering 2016-07-15 10:08:26 -06:00
Jason Wilder ff5d61d024 Speed up delete series
Reduce lock contention and process shards concurrently.
2016-07-14 17:31:34 -06:00
Jason Wilder 8f3ec3be43 Inline deleteShard
Only used by one caller now
2016-07-14 17:31:34 -06:00
Jason Wilder 78201e19d0 Refactor DeleteDatabase to use filter/walk funcs 2016-07-14 17:31:34 -06:00
Jason Wilder e0122efcf8 Speed up drop retention policy
Reduce the lock contention on tsdb.Store by taking a short lived
read-lock instead of a long write lock.  Also close shards in parallel
and drop the whole RP dir in bulk instead of each shard dir.
2016-07-14 17:31:34 -06:00
Jason Wilder 6d3d2f6fe9 Speed up drop measurement
Reduces the lock contention on the tsdb.Store by taking a short
read lock instead of a long write lock.  Also processes shards
in parallel instead of serially.
2016-07-14 17:31:29 -06:00
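The locking change shared by e0122efcf8 and 6d3d2f6fe9 can be sketched as: hold the store's read lock only long enough to snapshot the shard list, then work on the shards in parallel with no store lock held. The `store` and `shard` types below are simplified, hypothetical stand-ins for the real `tsdb` types.

```go
package main

import (
	"fmt"
	"sync"
)

// shard is a hypothetical, simplified shard with its own lock.
type shard struct {
	mu           sync.Mutex
	measurements map[string]struct{}
}

func (s *shard) dropMeasurement(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.measurements, name)
	return nil
}

func (s *shard) has(name string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.measurements[name]
	return ok
}

// store is a hypothetical stand-in for the shard container.
type store struct {
	mu     sync.RWMutex
	shards map[uint64]*shard
}

func newStore(n int, names ...string) *store {
	st := &store{shards: make(map[uint64]*shard)}
	for i := 0; i < n; i++ {
		sh := &shard{measurements: make(map[string]struct{})}
		for _, name := range names {
			sh.measurements[name] = struct{}{}
		}
		st.shards[uint64(i)] = sh
	}
	return st
}

// dropMeasurement snapshots the shards under a short read lock and then
// processes them concurrently, returning the first error seen.
func (st *store) dropMeasurement(name string) error {
	st.mu.RLock()
	shards := make([]*shard, 0, len(st.shards))
	for _, sh := range st.shards {
		shards = append(shards, sh)
	}
	st.mu.RUnlock()

	errs := make(chan error, len(shards))
	for _, sh := range shards {
		go func(sh *shard) { errs <- sh.dropMeasurement(name) }(sh)
	}
	var first error
	for range shards {
		if err := <-errs; err != nil && first == nil {
			first = err
		}
	}
	return first
}

// anyHas reports whether any shard still contains the measurement.
func (st *store) anyHas(name string) bool {
	st.mu.RLock()
	defer st.mu.RUnlock()
	for _, sh := range st.shards {
		if sh.has(name) {
			return true
		}
	}
	return false
}

func main() {
	st := newStore(4, "cpu", "mem")
	err := st.dropMeasurement("cpu")
	fmt.Println(err, st.anyHas("cpu"), st.anyHas("mem"))
}
```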
Jason Wilder 4254ad304c Merge pull request from influxdata/md-add-benchmarks
Add additional benchmarks for various schemas
2016-07-14 15:04:29 -06:00
Jason Wilder 0f5e994383 Fix panic in full compactions due to duplicate data in blocks
Due to a bug in compactions, it's possible some blocks may have duplicate
points stored.  If those blocks are decoded and re-compacted, an assertion
panic could trigger.

We now dedup those blocks if necessary to remove the duplicate points
and avoid the panic.
2016-07-14 11:32:36 -06:00
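Deduplicating decoded points, as described in 0f5e994383, can be sketched as a stable sort by timestamp followed by a single pass that keeps one value per timestamp. The `value` type is hypothetical, and keeping the later value for a repeated timestamp is an assumption made for this sketch; the commit message does not say which copy wins.

```go
package main

import (
	"fmt"
	"sort"
)

// value is a hypothetical decoded point: a timestamp and a float.
type value struct {
	t int64
	v float64
}

// dedup sorts by timestamp and keeps one value per timestamp. When a
// timestamp repeats, the value that appeared later in the input wins
// (an assumption of this sketch). The input slice is reused.
func dedup(vals []value) []value {
	sort.SliceStable(vals, func(i, j int) bool { return vals[i].t < vals[j].t })
	out := vals[:0]
	for _, v := range vals {
		if n := len(out); n > 0 && out[n-1].t == v.t {
			out[n-1] = v
			continue
		}
		out = append(out, v)
	}
	return out
}

func main() {
	vals := []value{{1, 1.0}, {2, 2.0}, {2, 2.5}, {3, 3.0}}
	fmt.Println(dedup(vals))
}
```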
Jason Wilder 0264966f5c Add index optimize planning step
For larger datasets, it's possible for shards to get into a state where
many large, dense TSM files exist.  While the shard is still hot for
writes, full compactions will skip these files since they are already
fairly optimized and full compactions are expensive.  If the write volume
is large enough, the shard can accumulate lots of these files.  When
a file is in this state, its index can contain every series which
causes startup times to increase since each file must parse the full
set of series keys for every file.  If the number of series is high,
the index can be quite large causing a large amount of disk IO at startup.

To fix this, an optimize compaction is run when a full compaction planning
step decides there is nothing to do.  The optimize compaction combines
and spreads the data and series keys across all files resulting in each
file containing the full series data for that shard and a subset of the
total set of keys in the shard.

This allows a shard to only store a series key once in the shard, reducing
storage size, and also allows a shard to only load each key once at startup.
2016-07-14 11:32:36 -06:00
Jason Wilder 5ee20e04a8 Fix compaction level planner
Large files created early in the leveled compactions could cause
a shard to get into a bad state.  This reworks the level planner
to handle those cases as well as splits large compactions up into
multiple groups to leverage more CPUs when possible.
2016-07-14 11:14:09 -06:00
Jonathan A. Sternberg 12a33fe0d3 Add stats and diagnostics to the TSM engine
Track the number of TSM files in the file store and keep engine
statistics related to the number of TSM compactions.
2016-07-07 19:35:55 -05:00
Jonathan A. Sternberg 837a9804cf Refactoring the monitor service to avoid expvar
Truncate the time interval output of the monitor service to be on even
time intervals rather than on every minute based on the start time. This
normalizes the output from the monitor service.
2016-07-07 11:13:58 -05:00
Jason Wilder 2f82d9a525 Truncate the slice when merging the caches 2016-07-05 12:12:21 -05:00