influxdb

Commit Graph

Author	SHA1	Message	Date
Jason Wilder	7f96d78b79	Make encoder re-usable This allows encoders to be re-used and maintained in a pool to avoid allocating new ones on every compaction and write of an encoded block. The pool used is not a sync.Pool to ensure that the encoders will not be garbage collected.	2016-09-26 12:19:15 -06:00
Mark Rushakoff	5b549ffdfe	Handle bounds errors in UnpackBlock	2016-07-19 15:43:27 -07:00
Jason Wilder	0b481ff627	Fix pathological TSM query case This fixes a pathological query condition caused by a problematic structuring of TSM files based on how points were written. The condition can occur when there are multiple TSM files and a large number of points are written into the past. The earlier existing TSM files must also have points in the past and close to the present causing their time range to eclipse the later files. When this condition occurs, some queries can spend an excessive amount of time merging all the overlapping blocks. The fix was to constrain the window of overlapping blocks based on the first one we ran into. There was also a simple case in the Merge where we could skip the binary search path and just append the two inputs.	2016-05-25 09:14:17 -06:00
Jason Wilder	4f39cb2f97	Fix case where Merge returns unsorted values	2016-05-09 15:40:34 -06:00
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the beginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	97504a552c	Support time range tombstones in FileStore/KeyCursor	2016-04-27 13:09:52 -06:00
Ben Johnson	286072f65a	update dep: simple8b @ b421ab40	2016-04-22 09:46:05 -06:00
Jason Wilder	f841a90d35	Use int64 instead of time.Time in timestamp encoder/decoder	2016-04-19 10:25:27 -06:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Joe LeGasse	f10c300765	Update to conversion tool to work in current versions After adding type-switches to the tsm1 packages, the custom implementation found in the conversion tool broke. This change uses tsm1.NewValue() instead of a custom implementation. This change also ensures that the tsm1.Value interface can only be implemented internally to allow for the optimized type-switch based encoding	2016-03-30 13:26:46 -04:00
Joe LeGasse	344e5abd41	Changed type-switch a few places to reduce allocations. Slices of tsm1.Value interfaces are only ever used with all the same types, and the previous code would switch on the type returned from a call to Value(), which allocated and returned an interface{} object for the underlying value. This change instead type-switches on the tsm1.Value object itself, allowing it direct access to the underlying value field, eliminating the unnecessary allocations.	2016-03-11 15:57:05 -05:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
Ben Johnson	b8918a780c	integer support	2016-02-10 09:40:25 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
INADA Naoki	80a637904d	tsm1: Use unixnano instead of time.Time	2016-02-03 10:05:40 +09:00
INADA Naoki	771253256b	FloatValue uses unixnano instead of time.Time	2016-02-03 09:57:00 +09:00
Ben Johnson	98baf078d0	tsm1 query performance improvements	2016-01-27 13:42:32 -07:00
Jason Wilder	fd2a409ea3	Skip decoding blocks that are already full	2015-12-17 12:47:05 -07:00
Jason Wilder	cf341eaa6a	Remove MinTime from blocks MinTime is now in the index for each block so storing it in the block header is redundant. The encodings also store it in their header so we are actually storing it 3 times. Removing this is an incompatible change with the current tsm1 file format.	2015-12-07 11:26:58 -07:00
Paul Dix	9637446ba9	Merge pull request #4990 from influxdb/pd-loadmetadata-wal Update TSM engine, WAL and encoding	2015-12-04 18:21:47 -05:00
Paul Dix	b0f3dcc8cc	Update TSM metadata loading and write snapshot * Update WriteSnapshot to always call synchronously * Update LoadMetadataIndex to load WAL metadata from the cache	2015-12-04 16:03:17 -05:00
Jason Wilder	c7e37766e7	Avoid repetitive index searches when iterating over cursors First pass at TSM cursor iteration ended up searching the file indexes too frequently and hurt performance. This changes that to search it once and then have the cursor hold onto the block locations to seek to. Doubles the query performance from the first iteration, but still a lot of room for improvement.	2015-12-04 10:02:59 -07:00
Paul Dix	eafb703afc	Update TSM engine, WAL and encoding * Add InfluxQLType to Values to map the TSM type to InfluxQL * Fix bug in WAL where close wouldn't nil out the currentSegment after closing it * Export writeSnapshot to be used in tests, add argument to run it async or not * Update reloadCache to load temporary metadata information in the engine * Update LoadMetadataIndex to use the temp WAL metadata information	2015-12-04 11:09:39 -05:00
Paul Dix	b0fb8a0a27	Update TSM cache, compact, wal, encoding * Update cache to have a single slice of values for a key (removed checkpoints) * Changed compact.Plan to only worry about TSM files. * Updated Plan to not return an error since there was no case in which it would. * Update WAL to not keep stats since they're no longer needed. * Update engine to flush the Cache/WAL to a new TSM file when the min threshold is hit. * Split compact logic between TSM compacts and WAL/Cache writes. * Remove unnecessary merge iterator, wal segment iterator, and other no longer necessary stuff. * Remove the ascending bool from the Dedupe method. Values should always be in ascending order. It's up to the cursor to iterate through values based on the direction. Giving the cursor responsibility makes it so we don't need to sort, dedupe or reallocate anything for different query orders. * Updated engine to use its locks to ensure writes and cache flushes don't cause a race. * Update all tests with new signatures. Removed a bunch of tests around TSM rewrites and WAL segment iteration that are no longer necessary.	2015-12-03 08:11:50 -05:00
Philip O'Toole	bad0f657de	Deduplicate supports requesting sort order	2015-11-30 16:21:44 -08:00
Jason Wilder	25206c729c	Add compactor type	2015-11-24 08:50:07 -07:00
Jason Wilder	a7d7c280ed	Add block type to index This will facilitate loading a block into a type specific result without first loading the block. This will also allow us to populate the database index solely from the index.	2015-11-17 23:24:09 -07:00
Jason Wilder	e5022a898d	Support decoding into type specific slices There are a lot of allocations performed when decoding blocks. These types can be re-used to reduce allocations in many cases. This change allows a type specific slice to be passed in to decode funcs to be re-used if it is large enough. The existing decode is left for backwards compatibility but is not very efficient right now. It may be removed.	2015-11-17 23:24:09 -07:00
Jason Wilder	5a12c49475	Make type specific decoders exported	2015-11-17 23:24:09 -07:00
Jason Wilder	d517bad6f2	Add BlockType func Allows the block type to be determined without decoding all the values.	2015-11-17 23:24:09 -07:00
Philip O'Toole	76b02c9143	Merge pull request #4812 from influxdb/checkpointed_wal_tsm_cache Checkpointed WAL tsm1 cache	2015-11-17 11:27:00 -08:00
Philip O'Toole	d8ea132c53	Add WAL cache	2015-11-16 19:52:49 -08:00
Jason Wilder	b279534f2a	Remove type specific casts in encoders This prevented the encoders from using other implementations of the Value interface because it would always cast one of the types to our specific implementations.	2015-11-16 08:44:52 -07:00
Jason Wilder	7f4a3f516b	Return error if NaN is encoded in a block	2015-10-27 17:12:56 -06:00
Ben Johnson	e9d303531e	reuse tsm1 decode buffer This commit changes `tsm1.DecodeBlock()` to reuse the same slice of `[]tsm1.Value` instead of reallocating a new one each time.	2015-10-23 12:51:55 -06:00
Ben Johnson	c27f8ae3a4	tsm1 meta lint	2015-10-15 15:03:10 -06:00
Jason Wilder	758359accc	Prevent panic in DecodeSameTypeBlock If DecodeSameTypeBlock is called on an empty Values slice, it would panic with an index out of bounds error. This func can actually be removed because DecodeBlock can determine what type of values are encoded already. This will still panic if the block cannot be decoded due to other reasons. Fixes #4365	2015-10-09 12:52:23 -06:00
Jason Wilder	c6f2f9cec2	Avoid duplicating values slice when encoding	2015-10-05 20:09:56 -04:00
Jason Wilder	cb28dabf62	Make DecodeBlock panic if block size is too small Should never get a block size < 9 bytes since Encode always returns the min timestamp and a 1 byte header. If we get this, the engine is confused.	2015-10-05 20:09:56 -04:00
Jason Wilder	b0449702e5	Fix comment typos	2015-10-05 20:09:56 -04:00
Jason Wilder	1d754db00b	Propagate all encoding errors to engine Avoid panicking in lower level code and allow the engine to decide what it should do.	2015-10-05 20:09:56 -04:00
Jason Wilder	4c54c78009	Move compression encoding constants to encoders Will make it less error-prone to add new encodings in the future since each encoder has its own set of constants. There are some placeholder constants for uncompressed encodings which are not in all encoders currently.	2015-10-05 20:09:56 -04:00
Paul Dix	594253cbba	Rename storage engine to tsm1, for Time Structured Merge Tree!	2015-10-05 20:09:55 -04:00

45 Commits (f632b41f6aae6aae750b3c323f9e660894a266b7)