influxdb

Commit Graph

Author	SHA1	Message	Date
Jason Wilder	1a35c0a3fc	Fix neverending full compactions The full compaction planner could return a plan that only included one generation. If this happened, a full compaction would run on that generation producing just one generation again. The planner would then repeat the plan. This could happen if there were two generations that were both over the max TSM file size and the second one happened to be in level 3 or lower. When this situation occurs, one cpu is pegged running a full compaction continuously and the disks become very busy basically rewriting the same files over and over again. This can eventually cause disk and CPU saturation if it occurs with more than one shard. Fixes #7074	2016-09-03 17:35:14 -06:00
Jason Wilder	ded6e40d47	Remove lastPlanCheck var This causes full compactions to not run if the server is running, but after a restart they do run.	2016-07-26 12:58:25 -06:00
Jason Wilder	0264966f5c	Add index optimize planning step For larger datasets, it's possible for shards to get into a state where many large, dense TSM files exist. While the shard is still hot for writes, full compactions will skip these files since they are already fairly optimized and full compactions are expensive. If the write volume is large enough, the shard can accumulate lots of these files. When a file is in this state, it's index can contain every series which causes startup times to increase since each file must parse the full set of series keys for every file. If the number of series is high, the index can be quite large causing large amount of disk IO at startup. To fix this, a optmize compaction is run when a full compaction planning step decides there is nothing to do. The optimize compaction combines and spreads the data and series keys across all files resulting in each file containing the full series data for that shard and a subset of the total set of keys in the shard. This allows a shard to only store a series key once in the shard reducing storage size as well allows a shard to only load each key once at startup.	2016-07-14 11:32:36 -06:00
Jason Wilder	5ee20e04a8	Fix compaction level planner Large files created early in the leveled compactions could cause a shard to get into a bad state. This reworks the level planner to handle those cases as well as splits large compactions up into multiple groups to leverage more CPUs when possible.	2016-07-14 11:14:09 -06:00
Jason Wilder	1ff8ecf4fb	Add ability to disable shards Disabling a shard causes all writes and queries to a shard to return an error. This also disables compactions for the shard.	2016-05-31 10:51:54 -06:00
Jason Wilder	7d50970631	Fix continous compaction edge case The level planner would keep including the same TSM files to be recompacted even if they were already quite compacted and split across several TSM files. Fixes #6683	2016-05-25 10:36:24 -06:00
Jason Wilder	f2bcf9d9ab	Code review fixes	2016-05-18 15:25:56 -06:00
Jason Wilder	e859141b75	Speed up tests Switched the max keys test to write int64 of the same value so RLE would kick in and the file size will be smaller (84MB vs 3.8MB). Removed the chunking test which was skipped because the code will not downsize a block into smaller chunks now. Skip MaxKeys tests in various environments because it needs to write too much data to run reliably.	2016-05-18 15:25:56 -06:00
Jason Wilder	eff71cbe23	Rollover to new TSM file when max blocks exceeded Fixes #6406	2016-05-18 15:25:55 -06:00
Jason Wilder	23fc9ff748	Revert "Fix memory spike when compacting overwritten points" This reverts commit `d99c5e26f6`.	2016-05-16 09:30:34 -06:00
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Jason Wilder	6042e114a1	Remove tombstoned values during compaction This will skip blocks that are fully tombstoned as well as remove points that have been removed within a block.	2016-04-27 13:09:53 -06:00
Jason Wilder	c8bd41c2d8	Remove TSM reader Keys func It's very inneficient and should never be used.	2016-04-27 13:09:52 -06:00
Jason Wilder	a789e819a3	Remove NewTSMReaderWithOptions There are two TSMIndex implementations, the directIndex and the indirectIndex. Originally, we only had the directIndex and later added the indirectIndex and NewTSMReaderWithOptions in order to allow both indexes to be used in tests and code. This has created a problem since we really only use the directIndex for writing and always use the indirectIndex for reading. This changes removes the NewTSMReaderWithOptions func so that it is no longer possible to create a TSMReader with a directIndex. This will allow a lot of the block reading code used by the directIndex to be removed and simplify maintainence. It also gives better test coverage of the code that is actually used by the TSM engine now.	2016-04-27 13:09:52 -06:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Mark Rushakoff	e76967efb6	Add stats to tsm1.Cache	2016-02-19 16:37:34 -08:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
runner.mei	53f7e03f72	fix TSMReader.Delete() and all unit tests is pass in the windows	2016-01-30 14:15:46 +08:00
Jason Wilder	756421ec4a	Look for fully compacted block in addition to max size during compaction Some data shapes would cause files to grow larger than the max size more quickly which resulted in them getting skipped by the full compaction planner at times. Some datasets that could make this happen are very large keys or very large numbers of keys (10M). When this happened, multiple max sized files would accumulate but the blocks would not be full. When the shard went cold for writes, these files would get recompacted down to the optimal size, but a lot of space would be wasted in the mean time.	2016-01-07 15:18:42 -07:00
Jason Wilder	b6da176a4b	Fix direct index size not calculated	2015-12-23 18:01:11 -07:00
Jason Wilder	a38c95ec85	Update compactions to run concurrently This has a few changes in it (unfortuantely). The main change is to run compactions concurrently. While implementing this, a few query and performance bugs showed up that are also fixed by this commit.	2015-12-23 18:01:11 -07:00
Jason Wilder	bb2562b2ab	Return CompactionGroups from planning	2015-12-23 18:01:11 -07:00
Jason Wilder	d0ec0a15e2	Fix wrong test data setup	2015-12-23 18:01:11 -07:00
Jason Wilder	7e97b0eafd	Fix rename temp file on windows	2015-12-18 11:57:37 -07:00
Jason Wilder	fd2a409ea3	Skip decoding blocks that are already full	2015-12-17 12:47:05 -07:00
Jason Wilder	3893bc60e1	Speed up TSM compactor Just keep the current block for each iterator in the buffers.	2015-12-16 11:16:17 -07:00
Jason Wilder	00f570441b	Convert TSMKeyIterator to return blocks	2015-12-16 11:16:17 -07:00
Jason Wilder	59a57d8f73	Convert CacheKeyIterator to return encoded blocks	2015-12-16 11:16:17 -07:00
Jason Wilder	0623648140	Add chunking support back to TSMKeyIterator Was removed when MergeIterator was deleted.	2015-12-16 11:16:17 -07:00
Jason Wilder	31b97c3fe0	Add max points per block back for CacheKeyIterator Was removed when MergeIterator was removeed.	2015-12-16 11:16:16 -07:00
Paul Dix	b192136887	Merge pull request #5058 from influxdb/pd-update-compaction-logic Update TSM compaction logic	2015-12-08 18:14:15 -05:00
Paul Dix	96445a53a7	Update TSM compaction logic * Update compaction to look at newest files of the smallest step first * Update compaction to look at older files in larger steps if newer files don't have enough small steps to compact * Changed the TestDefaultCompactionPlanner_CombineSequence test to reflect what's possible now. We'd only have multiple files in the same generation if the all files but one were over the max allowable size. * Clean up the logic on when full compactions are run and when planning can be skipped	2015-12-08 17:33:38 -05:00
Jason Wilder	9d82e24ca0	Fix performance of dropping large number of keys	2015-12-08 10:47:06 -07:00
Paul Dix	937233d988	Update TSM compaction planning logic * Update Plan to do a full compaction if cold for writes * Remove MaxFileSize as a config variable from Compactor. Should be a set constant * Update Plan to keep track of if the last check was fully compacted so we can skip future planning calls * Update compact min file count to 3 so that compactions run more frequently	2015-12-07 08:26:30 -05:00
Paul Dix	1bee7d1512	Update TSM, remove old version, add config * remove rolloverTSMFileSize constant that is no longer used * remove the maxGenerationFileCount since it is no longer a limitation that's necessary with the new compaction scheme. We no longer read WAL segments as part of the compaction so memory is only used as we read in each individual key * remove minFileCount and switch to a user configurable variable * remove the mutex from WALSegmentWriter. There's never more than one open in the WAL at one time and it's not exported through any function so the lock on the WAL should be used. This simplified keeping track of the last write time and removed a bunch of unnecessary locks. * update WALSegmentWriter.Write to take the compressed bytes so that encoding and compression can occur before the call to write (while we don't hold the WAL lock) * remove a bunch of unnecessary locking in WAL.writeToLog * Add check for TSM file magic number and vesion * Remove old tsm, log, and unused cursor code * Remove references to tsm1dev everywhere except in the inspector * Clean up config options for compaction and snapshotting * Remove old TSM configuration options * Update the config.sample.toml with TSM options * Update WAL compact to force if it has been cold for writes for a configurable period of time (1h by default)	2015-12-06 18:50:39 -05:00
Jason Wilder	6592615958	Updated compaction strategy This changes compacting files to merge sequences of files in lower generations up to later generations	2015-12-04 23:30:39 -07:00
Jason Wilder	357b88c439	Increment sequence of max generation when compaction files	2015-12-04 13:46:28 -07:00
Jason Wilder	52bec1f7f6	Change TSM file naming to generation-sequence.tsm	2015-12-04 11:51:33 -07:00
Paul Dix	bf65e967aa	Add test for compacting multiple TSM files	2015-12-03 10:36:17 -05:00
Paul Dix	b0fb8a0a27	Update TSM cache, compact, wal, encoding * Update cache to have a single slice of values for a key (removed checkpoints) * Changed compact.Plan to only worry about TSM files. * Updated Plan to not return an error since there was no case in which it would. * Update WAL to not keep stats since they're no longer needed. * Update engine to flush the Cache/WAL to a new TSM file when the min threshold is hit. * Split compact logic between TSM compacts and WAL/Cache writes. * Remove unnecessary merge iterator, wal segment iterator, and other no longer necessary stuff. * Remove the asending bool from the Dedupe method. Values should always be in ascending order. It's up to the cursor to iterate through values based on the direction. Giving the cursor responsibility makes it so we don't need to sort, dedupe or reallocate anything for different query orders. * Updated engine to use its locks to ensure writes and cache flushes don't cause a race. * Update all tests with new signatures. Removed a bunch of tests around TSM rewrites and WAL segment iteration that are no longer necessary.	2015-12-03 08:11:50 -05:00
Jason Wilder	751d1dd467	Don't rewrite TSM files while WAL segments exist This approach is not working and needs to be reworked.	2015-12-02 09:45:24 -07:00
Jason Wilder	5744f5ba02	Add ability to filter values by time when writing TSM files	2015-12-02 09:45:24 -07:00
Jason Wilder	708266da69	Cache related compaction fixes	2015-12-02 09:45:24 -07:00
Jason Wilder	231c052003	Don't limit WAL segments during compaction Since they are already loaded in the cache, this limit is not really needed anymore.	2015-12-02 09:45:24 -07:00
Jason Wilder	7e249e0555	Use CacheKeyIterator instead of WALKeyIterator during compactions	2015-12-02 09:45:24 -07:00
Jason Wilder	4a03469662	Integrate TSM compaction into dev engine	2015-12-02 09:45:23 -07:00
Jason Wilder	d4b1c25f8e	Add CompactionPlanner type CompactionPlanner is used to determine which files (WAL Segments, TSM Files) to include in a given compaction run.	2015-12-02 09:45:23 -07:00
Jason Wilder	5291fbcf39	Add TSM support to MergeIterator Enables the ability to combine multiple TSM files into one as well as merge existing TSM files with newer WAL segment values.	2015-12-02 09:45:23 -07:00
Jason Wilder	acdb6bcdf6	Add TSMKeyIterator Allows iterating of multiple TSM files in sort key and values order.	2015-12-02 09:45:23 -07:00
Jason Wilder	25206c729c	Add compactor type	2015-11-24 08:50:07 -07:00

1 2

52 Commits (e7629e966897634807a6134f6a3b4daa57ef6a0c)