447 Commits (510ee2c790b006f2d9588d09081da9e504f43a68)

Author SHA1 Message Date
Jon Seymour 510ee2c790 tsm: cache: during writes, update the memSize statistic outside the lock
Since we are not locking but relying on atomic arithmetic,
use Add rather than Set. Will also result in slightly less garbage
being created.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-22 08:26:35 +11:00
Jon Seymour 9c6efe99f1 tsm: cache: ensure all statistics are initialised on cache creation.
The intent of this change is to ensure that all statistic fields of the
resulting tsm1_cache measurement are initialized on initialization of
the cache. That way, any consumer of those measurements doesn't
have to deal with the null case.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-21 15:33:50 +11:00
Jon Seymour 6697c721fb tsm: cache: add cache throughput related statistics.
Complementing and extending the changes in #5758.

Add 2 level statistics:

  * snapshotCount
  * cacheAgeMs

Add 2 counter statistics

  * cachedBytes
  * WALCompactionTimeMs

snapshotCount can be used to measure transient write errors that are causing snapshots to accumulate

cacheAgeMs can be used to gauge the level of write activity into the cache

The differences between cachedBytes stats sampled at different times can be used to calculate cache throughput rates

The ratio (cachedBytes-diskBytes)/WALCompactionTimeMs can be used to calculate WAL compaction throughput.

The ratio of the difference between the first and last WAL compaction times to the
interval length is an estimate of the percentage of cache throughput consumed.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2016-02-20 22:18:57 +11:00
Mark Rushakoff 602043e11b Add disk stats for FileStore 2016-02-19 16:37:34 -08:00
Mark Rushakoff d99c09cedd Add stats for current and old WAL segment sizes 2016-02-19 16:37:34 -08:00
Mark Rushakoff e76967efb6 Add stats to tsm1.Cache 2016-02-19 16:37:34 -08:00
Joe LeGasse dc8ed7953d Remove custom binary-conversion functions
Also cleaned up some excess allocations, and other cruft from the code
2016-02-18 13:56:35 -05:00
Ben Johnson f7e04abef7 remove NaN from query engine
This commit removes `math.NaN` returns from float iterators.
2016-02-17 14:11:31 -07:00
liang@qiniu.com 1ad0f933f4 Remove redundant wal files 2016-02-16 20:45:13 +08:00
Ben Johnson 0b3d367e5c Merge pull request #5623 from influxdata/jw-query-panic
Fix panic: runtime error: index out of range
2016-02-10 14:59:04 -07:00
Jason Wilder 0ce6dd1304 Fix panic: runtime error: index out of range
There was a fix in 5b1791, but it is not present in the current branch, likely due to a rebase issue.
The current code panics with a query like:

select value from cpu group by host order by time desc limit 1

This fixes the panic and prevents #5193 from recurring. The issue is that aggressively
closing the cursors clears out the seeks slice, so re-seeking will fail.
2016-02-10 14:00:58 -07:00
Justin Nuß 82c276756a Lint tsdb and tsdb/engine package 2016-02-10 21:33:46 +01:00
Ben Johnson d9a6a7340f add canonical paths 2016-02-10 11:30:52 -07:00
Ben Johnson 5a0d1ab7c1 rename influxdb/influxdb to influxdata/influxdb
This commit changes all the import and URL references from:

    github.com/influxdb/influxdb

to:

    github.com/influxdata/influxdb
2016-02-10 10:26:18 -07:00
Jonathan A. Sternberg d1f7c445e7 Modify iterators to work across shards
Aux iterators now ask the iterator creator what series will be returned
and determine which aux fields to create based on the results.

The `tsdb.Shards` struct also creates a call iterator around the
iterators returned from each shard.
2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg c2d1206177 Implement the fill iterator
Fill requires an additional function for IteratorCreator to retrieve the
series that will be returned from the iterator. When fill is required
for an aggregate, the IteratorCreator will be asked what series will be
returned by the created iterator.
2016-02-10 09:40:29 -07:00
Ben Johnson 6204350d65 fix math operations 2016-02-10 09:40:27 -07:00
Ben Johnson b4cb770a7f refactor aux iterators 2016-02-10 09:40:27 -07:00
Ben Johnson b8918a780c integer support 2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg 583477064c Check for `tsdb.EOF` when looking for the lowest timestamp of aux fields 2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg 34f14424dd Filter tags from the condition when building cursors on tsm1 2016-02-10 09:40:25 -07:00
Ben Johnson 00806de9b8 refactor query engine 2016-02-10 09:40:25 -07:00
Ben Johnson cde973f409 refactor query engine 2016-02-10 09:40:24 -07:00
Gabriel Levine 7d4217ab97 enabled golint for tsdb/engine/wal.go and wal_test.go and updated changelog. 2016-02-09 10:29:09 -05:00
Jason Wilder 2b3c640695 Fix reading too far in fileAccess.readBytes
Fixes #5566
2016-02-08 09:08:57 -07:00
Jason Wilder 28ae8b6fe0 Merge pull request #5434 from runner-mei/tsm_tombstone_windows
fix TSMReader.Delete() and all unit tests is pass in the windows
2016-02-04 16:27:26 -07:00
Jason Wilder b635e516e5 Merge pull request #5485 from runner-mei/patch-7
fix munmap bug in the windows
2016-02-04 13:47:51 -07:00
Jason Wilder 5a124e0e0b Merge pull request #5431 from runner-mei/patch-5
fix determine the file size
2016-02-04 10:24:05 -07:00
INADA Naoki 80a637904d tsm1: Use unixnano instead of time.Time 2016-02-03 10:05:40 +09:00
INADA Naoki 771253256b FloatValue uses unixnano instead of time.Time 2016-02-03 09:57:00 +09:00
INADA Naoki 898babf616 add float bench 2016-02-03 03:12:16 +09:00
runner.mei 4ca47103b1 fix TSMReader.Delete() and all unit tests is pass in the windows 2016-01-31 11:32:08 +08:00
runner bc992fea5e fix munmap bug in the windows
fix munmap bug in the windows

fix munmap bug in the windows

fix munmap bug in the windows

fix munmap bug in the windows
2016-01-31 10:46:46 +08:00
runner 4b7fe70cd3 fix determine the file size
fix determine the file size
2016-01-30 14:16:53 +08:00
runner.mei 53f7e03f72 fix TSMReader.Delete() and all unit tests is pass in the windows 2016-01-30 14:15:46 +08:00
Jason Wilder 924275b337 Fix panic preventing wal file truncation
Fixes #5455
2016-01-28 21:50:51 -07:00
Jason Wilder 9528c3ea70 Merge pull request #5465 from influxdata/jw-remote-writes
Optimize remote writes
2016-01-27 15:47:02 -07:00
Jason Wilder 1d165d38a9 Optimize Cache entry.add
This reduces some of the lock contention when writing to the cache.
When a new entry is created, it avoids an allocation.  It also skips
a check to see if we need to sort if we already know it needs to be sorted.
2016-01-27 14:26:42 -07:00
Ben Johnson 98baf078d0 tsm1 query performance improvements 2016-01-27 13:42:32 -07:00
Jason Wilder 372302bcbd Reduce lock contention in Cache.WriteMulti
A write-lock was taken the whole time, but we only need the write
lock at the end.
2016-01-25 16:48:34 -07:00
Jason Wilder 5bee8880db Reduce lock contention in engine.WritePoints
Writing the snapshot would deduplicate the snapshot points
while still holding the engine write-lock.  This can be expensive
under high load and cause writes to back up and OOM the server.

Instead, grab the snapshot under the lock and dedup it after releasing
the lock.

Possible fix for #5442
2016-01-25 15:37:34 -07:00
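The pattern described in the commit above (take the snapshot under the lock, deduplicate after releasing it) can be sketched as follows. The `engine` and `point` types and the last-write-wins dedup rule are assumptions made for illustration; they are not the actual tsm1 code.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// point is a hypothetical timestamped value.
type point struct {
	ts  int64
	val float64
}

// engine is a hypothetical stand-in holding points awaiting a snapshot.
type engine struct {
	mu      sync.Mutex
	pending []point
}

func (e *engine) write(p point) {
	e.mu.Lock()
	e.pending = append(e.pending, p)
	e.mu.Unlock()
}

// snapshot swaps out the pending slice while holding the lock, which is cheap,
// and does the expensive sort and dedup only after the lock is released.
func (e *engine) snapshot() []point {
	e.mu.Lock()
	snap := e.pending
	e.pending = nil
	e.mu.Unlock()

	return dedup(snap) // writers are no longer blocked here
}

// dedup sorts by timestamp and keeps the last-written value per timestamp.
func dedup(pts []point) []point {
	sort.SliceStable(pts, func(i, j int) bool { return pts[i].ts < pts[j].ts })
	out := pts[:0]
	for _, p := range pts {
		if n := len(out); n > 0 && out[n-1].ts == p.ts {
			out[n-1] = p // later write replaces the earlier one
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	e := &engine{}
	e.write(point{ts: 2, val: 1.0})
	e.write(point{ts: 1, val: 5.0})
	e.write(point{ts: 2, val: 9.0})
	fmt.Println(e.snapshot())
}
```

Because the snapshot owns its slice once the lock is released, the sort can run in place without racing against new writes, which land in a fresh `pending` slice.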
Jason Wilder 24f1bcfd20 Remove Dev prefix from tsm engine/tx 2016-01-10 16:43:36 -07:00
Jason Wilder 5b179113fc Don't close tsm cursor prematurely
We were closing the cursor when we read the last block which caused
the internal state to be cleared.  In a group by query, we seeked multiple
times so depending on the group by interval and how the data was laid out
in the blocks, we would close the cursor and the last block would get skipped.

Fixes #5193
2016-01-10 15:26:01 -07:00
Jason Wilder 3c45015311 Remove MAP_POPULATE
This may be causing slow restart times for systems with many large TSM files.
What I believe is happening at startup in these cases is that multiple goroutines
are started to load each TSM file concurrently.  The kernel appears to serialize
mmap calls from the same process so all of the goroutines end up getting blocked
on the actual mmap system call.  MAP_POPULATE instructs the kernel to pre-fault the
page table for the files and triggers read-ahead of the pages.  For larger, 2GB files,
this makes the mmap call more expensive and slower.  When there are many of these files
and calls it is possible to fill all available memory with pagecache.  In this case,
the OS will end up pre-faulting pages from one file and have to remove pages that it just
loaded from another file, causing slowness.  MAP_POPULATE may also cause much more data
to be pre-faulted than necessary.  To load a file, we just need to scan the index at the end
of the file.  MAP_POPULATE is likely causing the whole file to be loaded when it won't actually
be accessed for a while (or at all).

Might fix issue #5311.
2016-01-08 08:45:27 -07:00
Jason Wilder 756421ec4a Look for fully compacted block in addition to max size during compaction
Some data shapes would cause files to grow larger than the max size more
quickly which resulted in them getting skipped by the full compaction planner
at times.  Some datasets that could make this happen are very large keys or
very large numbers of keys (10M).  When this happened, multiple max sized
files would accumulate but the blocks would not be full.  When the shard went
cold for writes, these files would get recompacted down to the optimal size, but
a lot of space would be wasted in the meantime.
2016-01-07 15:18:42 -07:00
Jason Wilder faf8ee17fa Fix typo 2016-01-06 12:53:04 -07:00
Jason Wilder d2b7c03175 Re-use the series key
Avoid allocating the string twice.
2016-01-06 12:52:13 -07:00
Jason Wilder 2f7a0090c1 Don't allocate a pre-sized buffer for each cursor
This is contributing to some of the high memory usage on queries and possibly
some OOMs.  This is slightly slower, but removing it allows some fairly large
count queries over 5M series to complete instead of crashing the process using
tsm1 engine.
2016-01-06 10:50:38 -07:00
Jason Wilder 6f577cfef5 Reduce allocations when compacting
Key() returned the key and the entries.  We did not always need the
entries so they would be allocated and ignored.  Added a KeyAt func
that just returns the key to avoid the unnecessary entries allocation.
2016-01-05 16:16:44 -07:00
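A minimal sketch of the idea in the commit above: a key-only accessor next to one that also materializes the entries. The `index` and `entry` types here are hypothetical, and the copy in `key` merely simulates the allocation the commit message refers to; only the name KeyAt comes from the commit.

```go
package main

import "fmt"

// entry is a hypothetical index entry describing one block's time range.
type entry struct{ minTime, maxTime int64 }

// index is a hypothetical in-memory index of keys and their entries.
type index struct {
	keys    []string
	entries [][]entry
}

// key returns the key at position i together with a freshly allocated
// copy of its entries, even if the caller only wants the key.
func (ix *index) key(i int) (string, []entry) {
	es := make([]entry, len(ix.entries[i]))
	copy(es, ix.entries[i])
	return ix.keys[i], es
}

// keyAt returns only the key, so no entries slice is allocated.
func (ix *index) keyAt(i int) string {
	return ix.keys[i]
}

func main() {
	ix := &index{
		keys:    []string{"cpu,host=a", "cpu,host=b"},
		entries: [][]entry{{{0, 10}}, {{0, 10}, {11, 20}}},
	}
	// A caller that only iterates over keys can skip the entries entirely.
	for i := range ix.keys {
		fmt.Println(ix.keyAt(i))
	}
}
```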
Jason Wilder 9a9ccab560 Reduce allocation in wal encoder
Use sync.Pool for some temporary buffers used while encoding instead of
allocating new ones each time.  Also increased the default buffer size which
might be too small.  Probably need to make this a config var.
2016-01-05 16:12:25 -07:00
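The sync.Pool technique mentioned in the commit above can be sketched as follows. The `encode` function and its length-prefixed framing are hypothetical and do not reflect the actual WAL entry format; the sketch only shows how a temporary buffer can be reused across calls instead of being allocated each time.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch buffers for encoding.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// encode writes a 4-byte big-endian length followed by the payload,
// using a pooled scratch buffer rather than allocating a new one per call.
func encode(payload []byte) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	buf.Write(hdr[:])
	buf.Write(payload)

	// Copy the result out, because the scratch buffer returns to the pool
	// and may be overwritten by the next caller.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	fmt.Printf("% x\n", encode([]byte("abc")))
}
```

The pooled buffer keeps whatever capacity it grew to on earlier calls, so repeated encodes avoid regrowing the scratch space.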