Commit Graph

45 Commits (4958cdc70be25b636b1598277ba9846afbced004)

Author SHA1 Message Date
Jeff Wendling 1c0e49e002 tsm1: ensure all written tsm files are fsynced
we were asserting to an *os.File in order to call Sync, but in some
cases the file handle has been wrapped, for example with limiting.
instead, assert to minimal interfaces for the functionality we need
and attempt to add some robustness in the code that creates the
writers by using a stronger interface with a Sync method.

fixes #9991
2018-06-25 11:36:22 -06:00
Stuart Carnie 0f6e6fb9ef
Merge pull request #9192 from influxdata/sgc-writer
ensure tsmWriter#Write returns ErrMaxBlocksExceeded
2018-02-01 15:39:01 -07:00
Edd Robinson 90903fa6ed Remove unused code/cleanup engine package 2018-01-20 13:56:45 +00:00
Jason Wilder 749c9d2483 Rate limit disk IO when writing TSM files
This limits the disk IO for writing TSM files during compactions
and snapshots.  This helps reduce the spiky IO patterns on SSDs and
when compactions run very quickly.
2017-12-14 22:02:32 -07:00
Jason Wilder 9f2a422039 Use disk based TSM index more selectively
The disk based temp index for writing a TSM file was used for
compactions other than snapshot compactions.  That meant it was
used even for smaller compactiont that would not use much memory.
An unintended side-effect of this is higher disk IO when copying
the index to the final file.

This switches when to use the index based on the estimated size of
the new index that will be written.  This isn't exact, but seems to
work kick in at higher cardinality and larger compactions when it
is necessary to avoid OOMs.
2017-12-06 13:45:43 -07:00
Jason Wilder 9c1d7d00a9 Switch O_SYNC to periodic fsync
O_SYNC was added with writing TSM files to fix an issue where the
final fsync at the end cause the process to stall.  This ends up
increase disk util to much so this change switches to use multiple
fsyncs while writing the TSM file instead of O_SYNC or one large
one at the end.
2017-12-06 09:35:24 -07:00
Stuart Carnie 682705d4a7 ensure tsmWriter#Write returns ErrMaxBlocksExceeded 2017-12-01 10:33:59 -07:00
Jason Wilder f2a681c4cf Unconditionally remove file when calling Remove 2017-10-03 10:49:17 -06:00
Jason Wilder 70817350b7 Ensure temp index files are cleaned up on error 2017-10-03 10:48:14 -06:00
Jason Wilder f668b0cc3f Only use O_SYNC for tsm file writing
Doing this for the WAL reduces throughput quite a bit.
2017-10-03 10:48:13 -06:00
Jason Wilder 94aba64b88 Re-use index entries slice when writing TSM index 2017-09-21 12:48:16 -06:00
Jason Wilder 61ca1243c7 Increase index disk writer buffer 2017-09-20 09:05:30 -06:00
Stuart Carnie 92756ec0ad Reduce allocations, improve readEntries performance by simplifying loop
* callers of ReadEntries and Key API can cache allocated slice
2017-09-19 11:57:10 -07:00
Jason Wilder 820856347c Don't use disk temp file for snapshots 2017-09-11 15:29:26 -06:00
Jason Wilder 97f7857715 Remove mutex on TSMWriter
This isn't used by more than one goroutine so locks are unnecessary.
2017-09-11 15:29:26 -06:00
Jason Wilder 7388eb9499 Use disk when writing TSM index 2017-09-11 15:29:25 -06:00
Jason Wilder bc4fb0ea10 Sort index entries if necessary
These are already sorted during compaction, so switch to sorting lazily
to avoid the CPU and allocations.  This would only occur when using if
using the writer directly.
2017-09-11 15:26:25 -06:00
Jason Wilder f18dec6a4a Use sorted slice for writing TSM index
The directIndex used by the TSMWriter maintained a map of series keys
to index entries.  When the index is written to the TSM file, the keys
are sorted and then written out in order.

The reason for this is because directIndex used to be the only index
and it was optimized more for reading.  The reading has been replaced
by the indirectIndex so the map of keys ends up wasting space.

During compactions, the series keys (and index entries) are already sorted
so this change uses the sorting to avoid the map and sort when writing the
index.  This reduces allocations and CPU usage quite a bit for larger cardinality
TSM files.
2017-09-11 15:26:24 -06:00
Jason Wilder 778000435a Conver all keys from string to []byte in TSM engine
This switches all the interfaces that take string series key to
take a []byte.  This eliminates many small allocations where we
convert between to two repeatedly.  Eventually, this change should
propogate futher up the stack.
2017-07-28 11:00:50 -06:00
Jason Wilder 208ef09f87 Prevent writing series keys that exceed max key size
WriteBlock was missing the check for the max series keys which allowed
series keys to be written that were larger than the 2 bytes allocated
to store their length.  When this occurred, the TSM can fail to load.
2017-05-24 13:41:09 -06:00
Jason Wilder 3c130cd39c Expose TSMWriter.Flush
Allows flushing the writer so we don't always need to close and
re-open the file handle.
2017-04-28 14:00:50 -06:00
Jason Wilder d155d37ca8 Reduce TSM write buffer
When many TSM files are being compacted, the buffers can add up fairly
quickly.
2017-04-20 12:28:42 -06:00
Mark Rushakoff 41415cf2fb Update godoc for tsm1 package 2017-01-02 07:30:18 -08:00
Jonathan A. Sternberg 8812bc8a93 Remove a double lock in the tsm1 index writer 2016-06-20 17:32:34 -05:00
Edd Robinson 40732a35d0 Merge pull request #6660 from influxdata/er-vet
Fix vet issues
2016-05-20 11:12:25 +01:00
Jason Wilder eff71cbe23 Rollover to new TSM file when max blocks exceeded
Fixes #6406
2016-05-18 15:25:55 -06:00
Edd Robinson f680ab0f0d Fix vet issues 2016-05-18 13:34:11 +01:00
Jason Wilder 27c2bc3f15 Sepearate IndexWriter from TSMIndex
Allows for future versionion of the TSMIndex as well as removing
a lot of unnecessary code.
2016-04-27 13:09:52 -06:00
Jason Wilder bb82331db7 Move TSMIndex defn to reader.go 2016-04-27 13:09:52 -06:00
Jason Wilder 873ac2715d Fix panic: runtime error: slice bounds out of range
Writing a key that exceeds the max key length could cause a panic
when reading a tsm file because the 2 bytes used for the key length
would not be enough to represent the actual key length.

The writer will now return an error if when trying to write a key
that is too large.
2016-03-30 23:44:17 -06:00
Jason Wilder 03ced4cc90 Load shards concurrently 2016-03-29 12:58:52 -06:00
Ben Johnson 6e1c1da25b reduce allocations in query execution
This commit removes some heap objects by converting them from
pointer references to non-pointers or by reusing buffers.
2016-03-22 09:51:39 -06:00
Jason Wilder 7567453c9a Ensure TSM files are fsync'd
Make sure TSM files are fsync'd when closed and also that the parent
dir is fsync'd when they are renamed.
2016-03-21 15:03:52 -06:00
Jason Wilder 8d70d65a82 Convert time.Time to int64 2016-02-25 15:15:01 -07:00
Joe LeGasse dc8ed7953d Remove custom binary-conversion functions
Also cleaned up some excess allocations, and other cruft from the code
2016-02-18 13:56:35 -05:00
Jason Wilder 6f577cfef5 Reduce allocations when compacting
Key() returned the key and the entries.  We did not always need the
entries so they would be allocated and ignored.  Added a KeyAt func
that just returns the key to avoid the unnecesary entries allocation.
2016-01-05 16:16:44 -07:00
Jason Wilder ee54a1e791 Write TSM data directly to writer
We were buffering up the data to write into byte slices to reduce
IO calls but at larger sizes, this causes memory to spike.  The TSMWriter
was switched to use a bufio.Writer internally so this byte slice buffering
is unnecessary and costly now.
2016-01-05 14:46:07 -07:00
Jason Wilder b6da176a4b Fix direct index size not calculated 2015-12-23 18:01:11 -07:00
Jason Wilder a38c95ec85 Update compactions to run concurrently
This has a few changes in it (unfortuantely).  The main change is to run compactions
concurrently.  While implementing this, a few query and performance bugs showed up that
are also fixed by this commit.
2015-12-23 18:01:11 -07:00
Jason Wilder 930174bf4d Handle calling WriteBlock with no data gracefully 2015-12-18 09:57:16 -07:00
Jason Wilder 6bc7765b88 Handle calling write with no values to TSMWriter gracefully 2015-12-18 09:52:53 -07:00
Jason Wilder 737871268b Speed up indirectIndex.UnmarshalBinary
Remove a bunch of unnecessary allocations to improve startup times.
2015-12-16 11:16:17 -07:00
Jason Wilder 45e87cdfe4 Strip checksum when returning block from ReadBytes 2015-12-16 11:16:16 -07:00
Jason Wilder 4a3037814f Add WriteBlock to TSMWriter 2015-12-16 11:16:16 -07:00
Jason Wilder 928aef04cd Split data_file.go into reader.go and writer.go 2015-12-16 11:16:16 -07:00