influxdb

Commit Graph

Author	SHA1	Message	Date
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the beginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	97504a552c	Support time range tombstones in FileStore/KeyCursor	2016-04-27 13:09:52 -06:00
Jason Wilder	a789e819a3	Remove NewTSMReaderWithOptions There are two TSMIndex implementations, the directIndex and the indirectIndex. Originally, we only had the directIndex and later added the indirectIndex and NewTSMReaderWithOptions in order to allow both indexes to be used in tests and code. This has created a problem since we really only use the directIndex for writing and always use the indirectIndex for reading. This change removes the NewTSMReaderWithOptions func so that it is no longer possible to create a TSMReader with a directIndex. This will allow a lot of the block reading code used by the directIndex to be removed and simplify maintenance. It also gives better test coverage of the code that is actually used by the TSM engine now.	2016-04-27 13:09:52 -06:00
Ben Johnson	286072f65a	update dep: simple8b @ b421ab40	2016-04-22 09:46:05 -06:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Jason Wilder	9984cd5d6d	Fix skipping blocks at query time when overlaps exist Depending on how data is written across TSM files, it was possible to skip over some blocks at query time making it look like data was missing.	2016-03-14 13:11:11 -06:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
Jason Wilder	756421ec4a	Look for fully compacted block in addition to max size during compaction Some data shapes would cause files to grow larger than the max size more quickly which resulted in them getting skipped by the full compaction planner at times. Some datasets that could make this happen are very large keys or very large numbers of keys (10M). When this happened, multiple max sized files would accumulate but the blocks would not be full. When the shard went cold for writes, these files would get recompacted down to the optimal size, but a lot of space would be wasted in the meantime.	2016-01-07 15:18:42 -07:00
Jason Wilder	a38c95ec85	Update compactions to run concurrently This has a few changes in it (unfortunately). The main change is to run compactions concurrently. While implementing this, a few query and performance bugs showed up that are also fixed by this commit.	2015-12-23 18:01:11 -07:00
Jason Wilder	9d82e24ca0	Fix performance of dropping large number of keys	2015-12-08 10:47:06 -07:00
Jason Wilder	87892d79da	Dedupe points at query time if there are overlapping blocks	2015-12-07 21:10:10 -07:00
Paul Dix	1bee7d1512	Update TSM, remove old version, add config * remove rolloverTSMFileSize constant that is no longer used * remove the maxGenerationFileCount since it is no longer a limitation that's necessary with the new compaction scheme. We no longer read WAL segments as part of the compaction so memory is only used as we read in each individual key * remove minFileCount and switch to a user configurable variable * remove the mutex from WALSegmentWriter. There's never more than one open in the WAL at one time and it's not exported through any function so the lock on the WAL should be used. This simplified keeping track of the last write time and removed a bunch of unnecessary locks. * update WALSegmentWriter.Write to take the compressed bytes so that encoding and compression can occur before the call to write (while we don't hold the WAL lock) * remove a bunch of unnecessary locking in WAL.writeToLog * Add check for TSM file magic number and version * Remove old tsm, log, and unused cursor code * Remove references to tsm1dev everywhere except in the inspector * Clean up config options for compaction and snapshotting * Remove old TSM configuration options * Update the config.sample.toml with TSM options * Update WAL compact to force if it has been cold for writes for a configurable period of time (1h by default)	2015-12-06 18:50:39 -05:00
Jason Wilder	52bec1f7f6	Change TSM file naming to generation-sequence.tsm	2015-12-04 11:51:33 -07:00
Jason Wilder	c7e37766e7	Avoid repetitive index searches when iterating over cursors First pass at TSM cursor iteration ended up searching the file indexes too frequently and hurt performance. This changes that to search it once and then have the cursor hold onto the block locations to seek to. Doubles the query performance from the first iteration, but still a lot of room for improvement.	2015-12-04 10:02:59 -07:00
Jason Wilder	adf5c5b223	Replace Next/Prev with Scan	2015-12-03 12:39:13 -07:00
Jason Wilder	be59ba3455	Add Prev support to FileStore Allows reading the previous block of values given a timestamp and key.	2015-12-03 12:39:12 -07:00
Jason Wilder	6fba01df89	Implement single field TSM queries	2015-12-03 12:35:36 -07:00
Jason Wilder	4a03469662	Integrate TSM compaction into dev engine	2015-12-02 09:45:23 -07:00
Jason Wilder	9c2be12b65	Add FileStore.Remove func Allows a TSMFile to be removed from the active set of files managed by the FileStore.	2015-11-16 09:16:10 -07:00
Jason Wilder	ef18f8afb2	Handle TSM key deletions This writes a tombstone file containing a line per deleted key. This file is read when a TSMReader is created and any keys listed in the file are removed from the index.	2015-11-16 08:44:52 -07:00
Jason Wilder	0ab423c7ff	Initial FileStore implementation Provides functionality to load a directory of TSM files (or add them manually) as well as reading blocks of values for individual key and times.	2015-11-16 08:44:52 -07:00

24 Commits (10db0aafeb8b21459a537d455afe48f8b19e22c4)