influxdb

Commit Graph

Author	SHA1	Message	Date
Jason Wilder	9b86bfea2a	Merge pull request #6582 from eleme/fix_engine_cache_size fix cache size of engine	2016-05-10 09:01:03 -06:00
Jason Wilder	8839cabd41	Add benchmark for Merge	2016-05-10 08:39:55 -06:00
Cory LaNou	4d30ea1eb3	minor PR feedback refactor	2016-05-10 08:14:51 -05:00
Cory LaNou	a3bf3e2ef1	added baseline backup/restore plumbing	2016-05-10 08:14:51 -05:00
Jason Wilder	4f39cb2f97	Fix case where Merge return unsorted values	2016-05-09 15:40:34 -06:00
Ben Johnson	078e561820	parallelize iterators	2016-05-09 10:25:30 -06:00
thbourlove	22c2e7e1c5	fix cache memory size of engine	2016-05-09 21:29:34 +08:00
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Ben Johnson	4c45f8ec32	Merge pull request #6560 from benbjohnson/optimize-tsm1-call-iterator Move call iterator to series level	2016-05-05 11:13:53 -06:00
Ben Johnson	fdf34d4356	move call iterator to series level This commit moves the `CallIterator` to wrap the individual series instead of wrapping a shard. This allows individual points to be aggregated before being merged. This will cause a small increase in memory usage per series but it shows a 20% decrease in query time when there are a moderate number of points per series.	2016-05-05 09:59:03 -06:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the beginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	57cb3fdbc0	Merge pull request #6522 from influxdata/tp-tsm-dump Dump TSM files to line protocol	2016-05-03 10:44:33 -06:00
Jason Wilder	4196554f51	Fix overwriting points returning wrong value The cursors were returning the wrong value in the case when points existed in both the cache and tsm files with the same timestamp. The cache value should have been returned, but the tsm value was returned incorrectly. Fixes #6439	2016-05-03 09:21:31 -06:00
Edd Robinson	fd77dbe648	Merge pull request #6546 from influxdata/er-build-tag Fix invalid build tag	2016-05-03 16:00:39 +01:00
Jonathan A. Sternberg	a2a5c32770	Merge pull request #6539 from influxdata/js-6495-fix-aggregates-with-empty-shards Fix aggregate returns when data is missing from some shards	2016-05-03 10:56:21 -04:00
Jonathan A. Sternberg	d6d0addcec	Fix aggregate returns when data is missing from some shards If a shard is empty for a specific field and the field type is something other than a float, a nil iterator would get returned from one of the empty shards and cause the combined iterators to be cast to the float type and all other iterator types to be discarded (or for integers, to be cast). This is rare since most aggregates don't accept strings or booleans, but for queries like: SELECT distinct(string) FROM mydata It would result in nothing getting returned if one of the shards didn't have a value for `string`. This change modifies the query engine to return nil for the shards instead of a fake iterator and then to only use the fake iterator if the final aggregate iterator is nil (meaning that no iterators could be constructed for the field from any shard). Fixes #6495.	2016-05-03 10:41:22 -04:00
Edd Robinson	d35fa1ec97	Remove redundant windows build tags	2016-05-03 14:22:02 +01:00
Jason Wilder	e0304ae3d5	Fix shards not getting assigned to series on restart Also, simplifies the LoadMetaDataIndex func to not require a *Shard	2016-05-02 11:36:05 -06:00
Jason Wilder	2d09937fd2	Fix removing fully deleted index blocks If multiple tombstone entries happen to exist for the same key in a tombstone file, it was possible to panic. The first application would remove all index entries and the second time around the code still assumed entries would exist and would index into the nil slice. Also fixes a case where the range of time would fully delete all index entries, but it did not align with math.MinInt64 and math.MaxInt64. This would cause the index locations to still exist in the offset slice. This is inefficient because the BlockIterator would still scan and decode the block only to discover that all the values are deleted. We now just remove it from the offsets slice in this case since the range of values are deleted.	2016-05-02 11:36:05 -06:00
Jason Wilder	58aa65d5a8	Optimize applyTombstones When a large tombstone file existed on disk, this code was slow since it would apply each tombstone to the index one at a time causing the index to be scanned for each key. Instead, we group all the tombstones together by timestamp and apply in bulk so that the index is scanned once for each set of tombstones. If we change to immutable tombstone files, it might be better to just write a file where all the keys have the same tombstone so we can re-apply them efficiently.	2016-05-02 11:36:05 -06:00
Jason Wilder	c73c7cea25	Revert filtering index entries in BlockIterator This was the wrong fix. The real issue was the tombstones were being read incorrectly and also applied incorrectly at times. This code is slower and not necessary so reverting it.	2016-05-02 11:36:04 -06:00
Jason Wilder	f9ace932c0	Fix V2 tombstone reading file position Each iteration of the loop was incrementing the position by 4 incorrectly. The position should start at four since the header is 4 bytes. This caused tombstones at the end of the file to not be read because the counter was out of sync with the actual file position which caused the loop to exit early. Probably better to refactor this to check for io.EOF instead of using the counter.	2016-05-02 11:36:04 -06:00
Jason Wilder	bd1009080e	Prevent writing empty tombstone files If you delete from a measurement with a tag that does not match any series, we would write an empty tombstone file and fail to load it back.	2016-05-02 11:36:04 -06:00
Jason Wilder	8082fc61ba	Fix parsing keys when loading database index The code for parsing a key out of the WAL or TSM files in the engine was naive and didn't account for measurements with escape chars. This uses the correct parsing code to parse and load them correctly. Fixes #6496	2016-04-30 14:47:19 -06:00
Todd Persen	9eb4c1ec57	Fix typo in comment.	2016-04-29 16:26:27 -07:00
Jason Wilder	abcb559b09	Remove index meta data when series and measurements are gone This removes the dropMeta param from the tsdb.Store.DeleteSeries and lets the shard determine when to remove the meta data from the index based on what series still have data in the shard. This uncovered a nasty bug in compactions where a fully deleted series would prematurely end the compactions and not carry forward the rest of the data in the TSM file. This is now fixed as well.	2016-04-29 16:31:57 -06:00
Jason Wilder	4e353867d5	Fix first block not getting purged when deleting series	2016-04-27 17:08:00 -06:00
Ben Johnson	f7af787aef	add DELETE query support This commit adds query language support for deleting series with a `DELETE` query.	2016-04-27 15:16:23 -06:00
Jason Wilder	aefd2ad08b	Add DeleteSeries and DeleteSeriesRange	2016-04-27 13:09:53 -06:00
Jason Wilder	c306090361	Fix tombstone rename on windows	2016-04-27 13:09:53 -06:00
Jason Wilder	86d37614e4	Remove debugging from test output	2016-04-27 13:09:53 -06:00
Jason Wilder	bf3aa5857d	Don't add tombstone for timerange not contained by file	2016-04-27 13:09:53 -06:00
Jason Wilder	6042e114a1	Remove tombstoned values during compaction This will skip blocks that are fully tombstoned as well as remove points that have been removed within a block.	2016-04-27 13:09:53 -06:00
Jason Wilder	23bbfb2192	Prevent truncated WAL entries from panicking	2016-04-27 13:09:53 -06:00
Jason Wilder	0de21ade40	Add delete range of values support to WAL and cache loader	2016-04-27 13:09:53 -06:00
Jason Wilder	d13d01b516	Allow deleting series by time on a shard	2016-04-27 13:09:53 -06:00
Jason Wilder	4d71d2b01f	Add support for deleting cache values using time range	2016-04-27 13:09:52 -06:00
Jason Wilder	c154cd4b4a	Remove TSMReaderOptions Not used	2016-04-27 13:09:52 -06:00
Jason Wilder	c8bd41c2d8	Remove TSM reader Keys func It's very inefficient and should never be used.	2016-04-27 13:09:52 -06:00
Jason Wilder	7e06d558d5	Update ContainsValue to handle tombstones	2016-04-27 13:09:52 -06:00
Jason Wilder	97504a552c	Support time range tombstones in FileStore/KeyCursor	2016-04-27 13:09:52 -06:00
Jason Wilder	27c2bc3f15	Separate IndexWriter from TSMIndex Allows for future versioning of the TSMIndex as well as removing a lot of unnecessary code.	2016-04-27 13:09:52 -06:00
Jason Wilder	bb82331db7	Move TSMIndex defn to reader.go	2016-04-27 13:09:52 -06:00
Jason Wilder	1ac0b01c5a	Remove fileAccessor No longer used	2016-04-27 13:09:52 -06:00
Jason Wilder	a789e819a3	Remove NewTSMReaderWithOptions There are two TSMIndex implementations, the directIndex and the indirectIndex. Originally, we only had the directIndex and later added the indirectIndex and NewTSMReaderWithOptions in order to allow both indexes to be used in tests and code. This has created a problem since we really only use the directIndex for writing and always use the indirectIndex for reading. This change removes the NewTSMReaderWithOptions func so that it is no longer possible to create a TSMReader with a directIndex. This will allow a lot of the block reading code used by the directIndex to be removed and simplify maintenance. It also gives better test coverage of the code that is actually used by the TSM engine now.	2016-04-27 13:09:52 -06:00
Jason Wilder	bc6328d196	Add time range support to tombstone files This adds support for a time range to tombstone files to allow a subset of points to be deleted instead of the whole series. It changes the tombstone file format to a binary format and maintains backwards compatibility with the old text format tombstone files.	2016-04-27 13:09:52 -06:00
Ben Johnson	286072f65a	update dep: simple8b @ b421ab40	2016-04-22 09:46:05 -06:00
Ben Johnson	d204a8b683	optimize tsm1.FloatDecoder This commit changes the `FloatDecoder.val` from a `float64` type to a `uint64` to avoid an additional type conversion during read. Now the type gets converted to a `float64` only on call to `Values()`.	2016-04-21 08:49:12 -06:00
Jason Wilder	87ceb7426a	Don't lock the cache while adding entries Entries have their own locking so the cache doesn't need to be locked when adding to them.	2016-04-20 16:08:58 -06:00
Jason Wilder	fbaa7db54f	Don't lock entry when scanning new values to add	2016-04-20 16:00:26 -06:00
Jason Wilder	bfa225f149	Merge pull request #6430 from influxdata/jw-cache-load-size Disable cache max memory size when reloading the cache	2016-04-20 14:35:23 -06:00
Stephen Gutekanst	9dc09c5257	Make logging output location more programmatically configurable (#6213) This has various benefits: - Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily. - More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.` fields, etc). - This is also more efficient, because it means `executeQuery` no longer allocates a single `log.Logger` each time it is called.	2016-04-20 21:07:08 +01:00
Jason Wilder	f679787080	Disable cache max memory size when reloading the cache The cache max memory size is an approximate size and can prevent a shard from loading at startup. This change disables the max size at startup to prevent this problem and sets the limit back after reloading. Fixes #6109	2016-04-20 10:41:30 -06:00
Jonathan A. Sternberg	c8c38e15cd	Merge pull request #6386 from influxdata/js-iterator-next-error Modify all of the iterators to allow returning an error on Next()	2016-04-20 10:39:53 -04:00
Ben Johnson	54454e1e5b	Merge pull request #6424 from benbjohnson/optimize-bit-reader Optimize tsm1.BitReader	2016-04-20 08:28:24 -06:00
Seif Lotfy	c6e3c87e00	Add Block checksum validation and "influx_inspect verify" tool Fixes #5502	2016-04-19 22:33:03 +02:00
Ben Johnson	1d2238c642	optimize tsm1.BitReader This commit rewrites the `tsm1.BitReader` to use an 8-byte buffer instead of a 1-byte buffer and provide an inlineable fast bit read.	2016-04-19 11:34:17 -06:00
Jason Wilder	f841a90d35	Use int64 instead of time.Time in timestamp encoder/decoder	2016-04-19 10:25:27 -06:00
Jason Wilder	61beeca426	Update timestamp benchmarks	2016-04-19 10:17:32 -06:00
Jonathan A. Sternberg	7ec2a991d5	Modify all of the iterators to allow returning an error on Next() This also switches the remaining iterators to be lazy so they can return errors properly. They needed to be converted to lazy initialization anyway, which has the side effect of making it much easier for us to propagate the underlying error during initialization. Updated the Emitter to return errors when it cannot read properly from the iterators.	2016-04-18 11:17:55 -04:00
Jonathan A. Sternberg	93745d9693	Merge pull request #6391 from influxdata/js-5553-limit-queries-slow-with-group-by Propagate the limit option to the low level iterators	2016-04-16 09:39:25 -04:00
Jonathan A. Sternberg	bd5fdd797d	Propagate the limit option to the low level iterators When a GROUP BY or multiple sources are used, the top level limit iterator requires reading the entire iterator stream so it can find all of the tag groups it needs to return. For large data series, this ends up with the limit iterator discarding a lot of output. This change adds a new lower level limit iterator on each series itself so that there are fewer data points that have to be thrown away by the top level iterator. Fixes #5553.	2016-04-15 18:23:54 -04:00
Jonathan A. Sternberg	835d08591e	Do not filter out empty tags from series keys	2016-04-13 09:15:57 -04:00
Jonathan A. Sternberg	60282cf52d	Merge pull request #6284 from influxdata/js-3371-where-clause-compare-tags-and-fields Enhance comparing tags and fields in the where clause	2016-04-12 11:45:54 -04:00
Pierre Fersing	29b19a2293	Fix deadlock in tsm1/file_store	2016-04-12 09:39:21 +02:00
Jonathan A. Sternberg	ea6262b712	Enhance comparing tags and fields in the where clause Now it is possible to compare tags and fields and it is also now possible to compare tags and tags. Previously, it was only possible to compare fields with fields and tags with a string or a regex. Fixes #3371.	2016-04-11 18:10:08 -04:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Jonathan A. Sternberg	028fdaff81	Merge pull request #6222 from influxdata/js-6206-descending-tsm1-iterators Handle nil values from the tsm1 cursor correctly	2016-04-06 10:05:20 -04:00
Jonathan A. Sternberg	94ec92d669	Handle nil values from the tsm1 cursor correctly Send nil values from the tsm1 cursor at the end of the cursor. After the cursor reached tsm1, the `nextAt()` call would always return the default value rather than a nil value. Descending also didn't work correctly because the seeking functionality for tsm1 iterators would always act like they were ascending instead of descending when choosing which value to select. This resulted in very strange output from the emitter since it couldn't figure out if it was ascending or descending. Fixes #6206.	2016-04-06 09:27:02 -04:00
Jason Wilder	3f4c5a5585	Fix race on measurementFields Both Shard and Engine had the same reference to the measurementField map, but they each protected it with their own locks. This causes a race when writes and queries are occurring because writes can add new fields to the map while queries are reading from it. The fix moves the ownership to the Engine and provides protected accessors that Shard now uses. For the most part, the accesses on shard were old dead code. Fixing the measurementFields map race created a new race on the internal fields map. This is now unexported and protected via MeasurementFields exported funcs. Fixes #6188	2016-04-01 18:57:01 -06:00
Jason Wilder	873ac2715d	Fix panic: runtime error: slice bounds out of range Writing a key that exceeds the max key length could cause a panic when reading a tsm file because the 2 bytes used for the key length would not be enough to represent the actual key length. The writer will now return an error when trying to write a key that is too large.	2016-03-30 23:44:17 -06:00
Jonathan A. Sternberg	711a6614e6	Implement the point limit monitor Fixes #6077.	2016-03-30 16:08:56 -04:00
Joe LeGasse	f10c300765	Update to conversion tool to work in current versions After adding type-switches to the tsm1 packages, the custom implementation found in the conversion tool broke. This change uses tsm1.NewValue() instead of a custom implementation. This change also ensures that the tsm1.Value interface can only be implemented internally to allow for the optimized type-switch based encoding	2016-03-30 13:26:46 -04:00
Jason Wilder	60c3898577	Add godoc for KeyAt func	2016-03-29 12:59:26 -06:00
Jason Wilder	1b08e2dd55	Use walk func to load all tsm keys to index Avoids allocating a big map of all keys.	2016-03-29 12:59:26 -06:00
Jason Wilder	d4757ad040	Remove sync.Pool from wal UnmarshalBinary When loading many shards concurrently they block trying to acquire a write lock in the sync pool adding a new source of contention. Since this code flow always needs to allocate a buffer it's not really buying us much.	2016-03-29 12:59:26 -06:00
Jason Wilder	03ced4cc90	Load shards concurrently	2016-03-29 12:58:52 -06:00
Ben Johnson	45f1c28adb	add tsm iterator stats buffer This commit adds a buffer for stats to be updated without requiring a mutex lock/unlock on every point. The tradeoff is that stats are not exactly precise. This works for our use case because stats are only periodically checked.	2016-03-23 12:23:22 -06:00
Jonathan A. Sternberg	a35d9602cd	Fix where filters when a OR is used and when a tag does not exist If an OR was used, merging filters between different expressions would not work correctly. If one of the sides had a set of series ids with a condition and the other side had no series ids associated with the expression, all of the series from the side with a condition would have the condition ignored. Instead of defaulting a non-existent series filter to true, it should just be false and the evaluation of the one side that does exist should take care of determining if the series id should be included or not. The AND condition used false correctly so did not have to be changed. If a tag did not exist and `!=` or `!~` were used, it would return false even though neither a field nor a tag equaled those values. This has now been modified to correctly return the correct series ids and the correct condition. Also fixed a panic that would occur when a tag caused a field access to become unnecessary. The filter using the field access still got created and used even though it was unnecessary, resulting in an attempted access to a non-initialized map. Fixes #5152 and a bunch of other miscellaneous issues.	2016-03-22 12:19:06 -04:00
Ben Johnson	6e1c1da25b	reduce allocations in query execution This commit removes some heap objects by converting them from pointer references to non-pointers or by reusing buffers.	2016-03-22 09:51:39 -06:00
Jonathan A. Sternberg	ad96207868	Fix ORDER BY desc so it doesn't skip values After reading the initial buffer, ORDER BY desc would read the next block into the buffer and only read the first element. It's because the code that was copied from the ascending cursor wasn't modified correctly to set the position to the last element in the buffer. The buffer size has also been lowered from 1000 to 10 to match with the ascending cursor for performance with limit queries. Fixes #6055.	2016-03-22 09:40:11 -04:00
Ben Johnson	7156c1f9bd	add IteratorStats This commit adds an `IteratorStats` that holds aggregate iterator processing information. A method is also added to `Iterator` to return the stats: Stats() influxql.IteratorStats The remote iterators will also emit their stats in the point stream upon first connection, on a given interval, and then finally once the last point has been sent.	2016-03-21 16:25:19 -06:00
Jason Wilder	ee2f21e76f	Merge pull request #6082 from influxdata/jw-tsm Fix partially written TSM files	2016-03-21 15:42:27 -06:00
Jason Wilder	7567453c9a	Ensure TSM files are fsync'd Make sure TSM files are fsync'd when closed and also that the parent dir is fsync'd when they are renamed.	2016-03-21 15:03:52 -06:00
Jason Wilder	a4e5446ddd	Return error when TSM writer close returns one The TSM writer uses a bufio.Writer that needs to be flushed before it's closed. If the flush fails for some reason, the error is not handled by the defer and the compactor continues on as if all is good. This can create files with truncated indexes or zero-length TSM files. Fixes #5889	2016-03-21 15:00:36 -06:00
Jonathan A. Sternberg	6655ca7769	Create a new interrupt iterator that will stop emitting points after an interrupt Use of the iterator is spread out into both `IteratorCreators` and inside of the iterators themselves. Part of the interrupt must be handled inside of the engine so it stops trying to emit points when an interrupt is found and another part of the interrupt has to happen when combining the iterators so it doesn't just start reading the next shard.	2016-03-21 12:07:07 -04:00
Jason Wilder	3fd40d48a1	Merge pull request #6006 from influxdata/jw-deadlock Fix deadlock when running backup	2016-03-14 13:36:45 -06:00
Jason Wilder	9984cd5d6d	Fix skipping blocks at query time when overlaps exist Depending on how data is written across TSM files, it was possible to skip over some blocks at query time making it look like data was missing.	2016-03-14 13:11:11 -06:00
Jason Wilder	000459e350	Fix deadlock when running backup A deadlock occurs under write load if a backup is run in between the time when a snapshot compaction has snapshotted the cache and successfully written it to disk. The issue is that the second snapshot call will block on the commit lock while it is holding the engine write lock. This causes all writes to block as well as prevents the currently running snapshot compaction from completing because it needs to acquire a read-lock. This PR removes the commit lock and just returns an error if a snapshot is in progress to allow any locks being held to be released. The caller can determine whether to retry or give up.	2016-03-14 12:36:48 -06:00
Joe LeGasse	344e5abd41	Changed type-switch a few places to reduce allocations. Slices of tsm1.Value interfaces are only ever used with all the same types, and the previous code would switch on the type returned from a call to Value(), which allocated and returned an interface{} object for the underlying value. This change instead type-switches on the tsm1.Value object itself, allowing it direct access to the underlying value field, eliminating the unnecessary allocations.	2016-03-11 15:57:05 -05:00
Jason Wilder	992c78ee22	Remove periodic shard maintenance goroutine This is no longer used in tsm and just periodically locks everything for no reason now.	2016-03-09 17:31:02 -07:00
Edd Robinson	58c03448aa	Merge pull request #5514 from influxdata/er-engine-panic Ensure shards and engine are safely closed	2016-03-09 18:56:36 +00:00
Jason Wilder	e3fef5593c	Merge pull request #5855 from jonseymour/jss-5854-go-master-breaks-build fix tests to cope with future changes to testing.quick.Check - see #5854	2016-03-01 19:03:21 -07:00
Mark Rushakoff	cdcb079769	Tag TSM stats with database, retention policy ... by extracting the db/rp from the given path. Now that the code has "standardized" on extracting db/rp this way, the ShardLocation struct is no longer necessary and thus has been removed. We're back on the previous style of passing the path and walPath to NewShard.	2016-02-29 09:17:34 -08:00
Jon Seymour	73b3a2a056	Merge #5855 (issue: #5854). RHS merges cleanly with 0.10.0 Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-29 20:37:32 +11:00
Jon Seymour	716cdd7f41	tsm: modify encoding tests to deal with possible nil slices from testing.quick.Check in go master The current go compiler at the tip of the go master (1d5001af) has a modified implementation of testing.quick.Check that now generates nil slices as test data. (See: https://gophers.slack.com/archives/general/p14567053570110). The existing tests expect round tripping to work in this case but it does not. So, in these cases we change the expectation to reflect actual behaviour. This needs to be checked for reasonableness.	2016-02-29 20:36:19 +11:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Jon Seymour	11123d2694	Merge #5833 (issue: #5832). Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-26 07:59:03 +11:00
Jon Seymour	2c7cd06b99	tsm: cache: need to check that snapshot has been sorted. Previously, the for loop at the end of the method assumed that all entries had been deduplicated, including the entry discovered in the snapshot. However, this wasn't actually true. With this change, we make it true. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-26 07:56:25 +11:00
Jon Seymour	7eabae68de	tsm: cache: add a test for the write sequence {6,1,snapshot,7,2} Consider the write sequence: 6,1,snapshot,7,2. The hot cache gets deduplicated, so is 2,7. Now consider the test if 1 >= 2, this is false, so needSort is not set to true. The problem is the implicit assumption that the snapshot is always sorted by the time that merged() runs, but this may not be true. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-26 07:43:50 +11:00
Jason Wilder	6ebc192298	Merge pull request #5678 from jonseymour/typo doc: typographical, spelling, grammar, word-choice and phrasing improvements.	2016-02-25 09:33:41 -07:00
Jason Wilder	daf68dbbd2	Merge pull request #5701 from jonseymour/js-deduplicate-safety tsm: cache: improve thread safety of Cache.Deduplicate (see #5699)	2016-02-25 09:18:10 -07:00
Jon Seymour	4d98a1cf28	tsm: cache: remove unnecessary lock escalation. Previously, we needed a write lock on the cache because it was the only lock we had available to guard updates to entry.values and entry.needSort. However, now that we have an entry-scoped lock for this purpose, we no longer need the cache write lock. Since merged() doesn't modify the .store or the c.snapshot.sort, there is no need for a write lock on the cache to protect the cache. So, we don't need to escalate here - we simply rely on the entry lock to protect the entries we are iterating over. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-26 01:31:54 +11:00
Jason Wilder	452d77cbaf	tsm: cache: introduce entry locks. Based on @jwilder's alternative to the 'dirty' slice that featured in previous iterations of this fix. Suggested-by: Jason Wilder <jason@influxdb.com> Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-26 00:05:38 +11:00
Jon Seymour	eb7eec078d	tsm: cache: introduce commit lock to Cache Currently two compactors can execute Engine.WriteSnapshot at once. This isn't thread safe since both threads want to make modifications to Cache.snapshot at the same time. This commit introduces a lock which is acquired during Snapshot() and released during ClearSnapshot(), ensuring that at most one thread executes within Engine.WriteSnapshot() at once. To ensure that we always release this lock, but only release the snapshot resources on a successful commit, we modify ClearSnapshot() to accept a boolean which indicates whether the write was successful or not and guarantee to call this function if Snapshot() has been called. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-25 12:10:37 +11:00
Jon Seymour	45d025db99	tsm: cache: add tests to demonstrate thread safety vulnerabilities There are two tests, each showing a different vulnerability. One test shows that Cache.Deduplicate modifies entries in a snapshot's store without a lock while cache readers are deduplicating those same entries while correctly locked. A second test shows that two threads trying to execute the methods that Engine.WriteSnapshot calls will cause concurrent, unsynchronized mutating access to the snapshot's store and entries. The tests fail at this commit and are fixed by subsequent commits. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-25 12:10:31 +11:00
Jon Seymour	d7d81f79da	tsm: cache: add a test that demonstrates concurrent reads are safe Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-25 12:06:10 +11:00
Mark Rushakoff	fb83374389	Track stats for number of series, measurements Per database: track number of series and measurements Per measurement: track number of series	2016-02-24 08:10:16 -08:00
Jon Seymour	530b86ba7d	tsm: cache: restore the semantics of cachedBytes and memSize stats Fixes #5805. This commit undoes a regression introduced by #5789. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-24 06:16:46 +11:00
Jon Seymour	3475356dc9	tsm: cache: fix semantics of snapshotCount statistic to make it useful. Fix for #5804. The commit for #5789 rendered the semantics of snapshotCount statistic useless. This commit restores semantics that have diagnostic value to this statistic. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-24 06:13:54 +11:00
Jason Wilder	017c24c98e	Simplify cache snapshotting The Cache had support for taking multiple snapshots to support writing multiple snapshots to TSM files concurrently if that happened to be a bottleneck. In practice, this is never a bottleneck and we only run one snapshotting goroutine continuously per shard which has worked well for all workloads. The multiple snapshot support introduces some unhandled failure scenarios where wal segments could be removed without writing them to TSM files. If a snapshot compaction fails to write due to transient disk errors, subsequent snapshots will continue, but the failed one will not be retried. When the subsequent ones succeeded, all closed wal segments are removed causing data loss. This change simplifies the snapshotting capability to ensure that there is only ever one snapshot. If one fails, the next snapshot will update the existing snapshot and retry all of old and new data. Fixes #5686	2016-02-23 09:38:51 -07:00
Jonathan A. Sternberg	50753de032	Merge pull request #5782 from influxdata/js-5777-audit-panics-in-influxql Remove the non-unreachable panics in the new query engine	2016-02-22 17:18:57 -05:00
Mark Rushakoff	191de2670c	Fix non-compiling test	2016-02-22 13:49:11 -08:00
Mark Rushakoff	fc5c8597ab	Merge pull request #5758 from influxdata/mr-disk-stats Track cache, WAL, filestore stats within tsm1 engine	2016-02-22 13:01:55 -08:00
Jason Wilder	aa2e878019	Fix cache not deduplicating points in some cases The cache had some incorrect logic for determining when a series needed to be deduplicated. The logic was checking for unsorted points and not considering duplicate points. This would manifest itself as many (duplicate) points being returned from the cache and after a snapshot compaction run, the points would disappear because snapshot compaction always deduplicates and sorts the points. Added a test that reproduces the issue. Fixes #5719	2016-02-22 13:24:42 -07:00
Jonathan A. Sternberg	7a03df2af1	Remove the non-unreachable panics in the new query engine The only panics left are ones that should be unreachable unless there is a bug. Fixes #5777.	2016-02-22 12:52:43 -05:00
Jon Seymour	c93da21a61	tsm: cache: only use NewCache for engine cache's snapshots use a simpler constructor The intent of this change is to avoid writing caches created for snapshot cache instances into the tsm1_cache measurement. We can do this by avoiding use of the NewCache constructor. All other methods are only intended to be called from on the engine cache - never on a snapshot. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-22 15:17:43 +11:00
Jon Seymour	510ee2c790	tsm: cache: during writes, update the memSize statistic outside the lock Since we are not locking but relying on atomic arithmetic, use Add rather than Set. Will also result in slightly less garbage being created. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-22 08:26:35 +11:00
Jon Seymour	9c6efe99f1	tsm: cache: ensure all statistics are initialised on cache creation. The intent of this change is to ensure that all statistic fields of the resulting tsm1_cache measurement are initialized on initialization of the cache. That way, any consumer of those measurements doesn't have to deal with the null case. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-21 15:33:50 +11:00
Jon Seymour	6697c721fb	tsm: cache: add cache throughput related statistics. Complementing and extending the changes in #5758. Add 2 level statistics: * snapshotCount * cacheAgeMs Add 2 counter statistics * cachedBytes * WALCompactionTimeMs snapshotCount can be used to measure transient write errors that are causing snapshots to accumulate cacheAgeMs can be used to gauge the level of write activity into the cache The differences between cachedBytes stats sampled at different times can be used to calculate cache throughput rates The ratio (cachedBytes-diskBytes)/WALCompactionTimeMs can be used to calculate WAL compaction throughput. The ratio of difference between first and last WAL compaction time over the interval length is an estimate of percentage of cache throughput consumed. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-20 22:18:57 +11:00
Mark Rushakoff	602043e11b	Add disk stats for FileStore	2016-02-19 16:37:34 -08:00
Mark Rushakoff	d99c09cedd	Add stats for current and old WAL segment sizes	2016-02-19 16:37:34 -08:00
Mark Rushakoff	e76967efb6	Add stats to tsm1.Cache	2016-02-19 16:37:34 -08:00
Joe LeGasse	dc8ed7953d	Remove custom binary-conversion functions Also cleaned up some excess allocations, and other cruft from the code	2016-02-18 13:56:35 -05:00
Ben Johnson	f7e04abef7	remove NaN from query engine This commit removes `math.NaN` returns from float iterators.	2016-02-17 14:11:31 -07:00
Jon Seymour	ab702eb44a	doc: remove the implication that the wal directory is inside the shard directory. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 05:33:22 +11:00
Jon Seymour	ed0a112f8e	doc: Add an Errata section intended to capture clarifications prior to full revisions of the text. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jon Seymour	5e563d53c1	doc: revise discussion about cache design The description of the cache design was out of date - reflecting an older design based on checkpoints and evictions. This revision updates the design to describe snapshots and also clarify that if compaction performance falls behind the inbound write rate then writes will fail. Updates based in part of clarifications provided by Jason Wilder. See https://goo.gl/L7AzVu Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jon Seymour	cdc7e28338	doc: rephrasing of how sets of SeriesIterators are generated. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jon Seymour	58d1b7223a	doc: refine TSM file system layout description Minor improvements to phrasing to use the English word 'directory' and slight improvements to grammar.	2016-02-15 00:29:02 +11:00
Jon Seymour	285e0ad17a	doc: refine description of the conclusion of the compaction process. Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jon Seymour	008af05f7b	doc: various grammar/word-choice improvements in TSM design document Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jon Seymour	88598f78dc	doc: fix up some spelling errors/typos in .MD files Signed-off-by: Jon Seymour <jon@wildducktheories.com>	2016-02-15 00:29:02 +11:00
Jason Wilder	0ce6dd1304	Fix panic: runtime error: index out of range There was a fix in 5b1791, but is not present in the current branch likely due to a rebase issue. The current code panics with a query like: select value from cpu group by host order by time desc limit 1 This fixes the panic as well as prevents #5193 from re-occurring. The issue is that aggressively closing the cursors clears out the seeks slice so re-seeking will fail.	2016-02-10 14:00:58 -07:00
Ben Johnson	d9a6a7340f	add canonical paths	2016-02-10 11:30:52 -07:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
Jonathan A. Sternberg	d1f7c445e7	Modify iterators to work across shards Aux iterators now ask the iterator creator what series will be returned and determine which aux fields to create based on the results. The `tsdb.Shards` struct also creates a call iterator around the iterators returned from each shard.	2016-02-10 09:40:29 -07:00
Jonathan A. Sternberg	c2d1206177	Implement the fill iterator Fill requires an additional function for IteratorCreator to retrieve the series that will be returned from the iterator. When fill is required for an aggregate, the IteratorCreator will be asked what series will be returned by the created iterator.	2016-02-10 09:40:29 -07:00
Ben Johnson	6204350d65	fix math operations	2016-02-10 09:40:27 -07:00
Ben Johnson	b4cb770a7f	refactor aux iterators	2016-02-10 09:40:27 -07:00
Ben Johnson	b8918a780c	integer support	2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg	583477064c	Check for `tsdb.EOF` when looking for the lowest timestamp of aux fields	2016-02-10 09:40:25 -07:00
Jonathan A. Sternberg	34f14424dd	Filter tags from the condition when building cursors on tsm1	2016-02-10 09:40:25 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
Ben Johnson	cde973f409	refactor query engine	2016-02-10 09:40:24 -07:00
Jason Wilder	2b3c640695	Fix reading too far in fileAccess.readBytes Fixes #5566	2016-02-08 09:08:57 -07:00
Jason Wilder	28ae8b6fe0	Merge pull request #5434 from runner-mei/tsm_tombstone_windows fix TSMReader.Delete() and all unit tests is pass in the windows	2016-02-04 16:27:26 -07:00
Jason Wilder	b635e516e5	Merge pull request #5485 from runner-mei/patch-7 fix munmap bug in the windows	2016-02-04 13:47:51 -07:00
Jason Wilder	5a124e0e0b	Merge pull request #5431 from runner-mei/patch-5 fix determine the file size	2016-02-04 10:24:05 -07:00
Edd Robinson	1bcb1d033f	Allow Close to be called multiple times safely	2016-02-03 10:20:22 +00:00
INADA Naoki	80a637904d	tsm1: Use unixnano instead of time.Time	2016-02-03 10:05:40 +09:00
INADA Naoki	771253256b	FloatValue uses unixnano instead of time.Time	2016-02-03 09:57:00 +09:00
INADA Naoki	898babf616	add float bench	2016-02-03 03:12:16 +09:00
runner.mei	4ca47103b1	fix TSMReader.Delete() and all unit tests is pass in the windows	2016-01-31 11:32:08 +08:00
runner	bc992fea5e	fix munmap bug in the windows	2016-01-31 10:46:46 +08:00
runner	4b7fe70cd3	fix determine the file size	2016-01-30 14:16:53 +08:00
runner.mei	53f7e03f72	fix TSMReader.Delete() and all unit tests is pass in the windows	2016-01-30 14:15:46 +08:00
Jason Wilder	924275b337	Fix panic preventing wal file truncation Fixes #5455	2016-01-28 21:50:51 -07:00
Jason Wilder	9528c3ea70	Merge pull request #5465 from influxdata/jw-remote-writes Optimize remote writes	2016-01-27 15:47:02 -07:00
Jason Wilder	1d165d38a9	Optimize Cache entry.add This reduces some of the lock contention when writing to the cache. When a new entry is created, it avoids an allocation. It also skips a check to see if we need to sort if we already know it needs to be sorted.	2016-01-27 14:26:42 -07:00
Ben Johnson	98baf078d0	tsm1 query performance improvements	2016-01-27 13:42:32 -07:00
Jason Wilder	372302bcbd	Reduce lock contention in Cache.WriteMulti A write-lock was taken the whole time, but we only need the write lock at the end.	2016-01-25 16:48:34 -07:00
Jason Wilder	5bee8880db	Reduce lock contention in engine.WritePoints Writing the snapshot would deduplicate the snapshot points while still holding the engine write-lock. This can be expensive under high load and cause writes to back up and OOM the server. Instead, grab the snapshot under the lock and dedup it after releasing the lock. Possible fix for #5442	2016-01-25 15:37:34 -07:00
Jason Wilder	24f1bcfd20	Remove Dev prefix from tsm engine/tx	2016-01-10 16:43:36 -07:00
Jason Wilder	5b179113fc	Don't close tsm cursor prematurely We were closing the cursor when we read the last block which caused the internal state to be cleared. In a group by query, we seeked multiple times so depending on the group by interval and how the data was laid out in the blocks, we would close the cursor and the last block would get skipped. Fixes #5193	2016-01-10 15:26:01 -07:00
Jason Wilder	3c45015311	Remove MAP_POPULATE This may be causing slow restart times for systems with many large TSM files. What I believe is happening at startup in these cases is that multiple goroutines are started to load each TSM file concurrently. The kernel appears to serialize mmap calls from the same process so all of the goroutines end up getting blocked on the actual mmap system call. MAP_POPULATE instructs the kernel to pre-fault the page table for the files and triggers read-ahead of the pages. For larger, 2GB files, this makes the mmap call more expensive and slower. When there are many of these files and calls it is possible to fill all available memory with pagecache. In this case, the OS will end up pre-faulting pages from one file and have to remove pages that it just loaded from another file causing slowness. MAP_POPULATE may also cause much more data to be pre-faulted than necessary. To load a file, we just need to scan the index at the end of the file. MAP_POPULATE is likely causing the whole file to be loaded when it won't actually be accessed for a while (or at all). Might fix issue #5311.	2016-01-08 08:45:27 -07:00
Jason Wilder	756421ec4a	Look for fully compacted block in addition to max size during compaction Some data shapes would cause files to grow larger than the max size more quickly which resulted in them getting skipped by the full compaction planner at times. Some datasets that could make this happen are very large keys or very large numbers of keys (10M). When this happened, multiple max sized files would accumulate but the blocks would not be full. When the shard went cold for writes, these files would get recompacted down to the optimal size, but a lot of space would be wasted in the mean time.	2016-01-07 15:18:42 -07:00
Jason Wilder	faf8ee17fa	Fix typo	2016-01-06 12:53:04 -07:00
Jason Wilder	d2b7c03175	Re-use the series key Avoid allocating the string twice.	2016-01-06 12:52:13 -07:00
Jason Wilder	2f7a0090c1	Don't allocate a pre-sized buffer for each cursor This is contributing to some of the high memory usage on queries and possibly some OOMs. This is slightly slower, but removing it allows some fairly large count queries over 5M series to complete instead of crashing the process using tsm1 engine.	2016-01-06 10:50:38 -07:00
Jason Wilder	6f577cfef5	Reduce allocations when compacting Key() returned the key and the entries. We did not always need the entries so they would be allocated and ignored. Added a KeyAt func that just returns the key to avoid the unnecessary entries allocation.	2016-01-05 16:16:44 -07:00
Jason Wilder	9a9ccab560	Reduce allocation in wal encoder Use sync.Pool for some temporary buffers used while encoding instead of allocating new ones each time. Also increased the default buffer size which might be too small. Probably need to make this a config var.	2016-01-05 16:12:25 -07:00
Jason Wilder	ee54a1e791	Write TSM data directly to writer We were buffering up the data to write into byte slices to reduce IO calls but at larger sizes, this causes memory to spike. The TSMWriter was switched to use a bufio.Writer internally so this byte slice buffering is unnecessary and costly now.	2016-01-05 14:46:07 -07:00
Jason Wilder	d2889ecd6a	Avoid creating slices of all keys during compaction	2016-01-05 09:38:00 -07:00
Jason Wilder	7794b9c5d4	Fix panic: runtime error: slice bounds out of range The block count was a uint16 when incrementing the index location which was an int32. This caused the uint16 value to overflow before the index location was incremented causing the wrong location to be read on the next iteration of the loop. This triggers the slice out of range errors. Added a test that recreates the panic seen in #5257 and possibly #5202 which is older code. Fixes #5257	2016-01-04 11:20:24 -07:00
Paul Dix	49d480cb0c	Fix races in backup/restore	2015-12-31 08:42:01 -05:00
Paul Dix	5974d37649	Fix backup test to mock out compaction	2015-12-31 08:15:13 -05:00
Paul Dix	26e1c6464a	Update backup to address PR comments	2015-12-30 18:06:51 -05:00
Paul Dix	59fbd371fc	Implement backup/restore for TSM. This changes backup and restore to work for TSM. It breaks it for b1 and bz1, but since those are getting removed it's ok. The backup runs against any host that is specified and can backup either the metastore, a database, specific retention policy, or a specific shard. It can also take incremental backups with the `since` flag, which will only backup TSM files that have been created since that timestamp. The backup is safe to run online. However, for shards that are still hot for writes, they won't be able to create new TSM files while the backup for that single shard runs. If the backup isn't too large and the write throughput isn't too high this shouldn't be a problem since the writes will just go into the WAL cache.	2015-12-30 18:06:50 -05:00
Jason Wilder	b6da176a4b	Fix direct index size not calculated	2015-12-23 18:01:11 -07:00
Jason Wilder	f9ae8077da	Allow compactions to run when files have tombstones	2015-12-23 18:01:11 -07:00
Jason Wilder	a38c95ec85	Update compactions to run concurrently This has a few changes in it (unfortunately). The main change is to run compactions concurrently. While implementing this, a few query and performance bugs showed up that are also fixed by this commit.	2015-12-23 18:01:11 -07:00
Jason Wilder	48d4156eac	Fix blocks not sorted correctly when chunking	2015-12-23 18:01:11 -07:00
Jason Wilder	bb2562b2ab	Return CompactionGroups from planning	2015-12-23 18:01:11 -07:00
Jason Wilder	d0ec0a15e2	Fix wrong test data setup	2015-12-23 18:01:11 -07:00
Ady	5c888b3673	Merge branch 'master' of https://github.com/influxdb/influxdb into mvadu-patch-4358 Trying to get to latest master from influxdb	2015-12-19 01:45:07 +05:30
Jason Wilder	7e97b0eafd	Fix rename temp file on windows	2015-12-18 11:57:37 -07:00
Jason Wilder	611017f4ed	Add comments	2015-12-18 10:00:07 -07:00
Jason Wilder	930174bf4d	Handle calling WriteBlock with no data gracefully	2015-12-18 09:57:16 -07:00
Jason Wilder	6bc7765b88	Handle calling write with no values to TSMWriter gracefully	2015-12-18 09:52:53 -07:00
Jason Wilder	421a127f11	Add indirectIndex.UnmarshalBinary benchmark	2015-12-17 15:38:51 -07:00
Jason Wilder	8c7e11f4cf	Aggressively clean up KeyCursor resources	2015-12-17 12:51:51 -07:00
Jason Wilder	fd2a409ea3	Skip decoding blocks that are already full	2015-12-17 12:47:05 -07:00
Jason Wilder	825296ddd8	Add comments	2015-12-16 11:30:06 -07:00
Jason Wilder	88324bf61c	Optimize indirectIndex.UnmarshalBinary further	2015-12-16 11:28:13 -07:00
Jason Wilder	70d1f45058	Load TSM files concurrently	2015-12-16 11:28:12 -07:00
Jason Wilder	737871268b	Speed up indirectIndex.UnmarshalBinary Remove a bunch of unnecessary allocations to improve startup times.	2015-12-16 11:16:17 -07:00
Jason Wilder	3893bc60e1	Speed up TSM compactor Just keep the current block for each iterator in the buffers.	2015-12-16 11:16:17 -07:00
Jason Wilder	00f570441b	Convert TSMKeyIterator to return blocks	2015-12-16 11:16:17 -07:00
Jason Wilder	59a57d8f73	Convert CacheKeyIterator to return encoded blocks	2015-12-16 11:16:17 -07:00
Jason Wilder	0623648140	Add chunking support back to TSMKeyIterator Was removed when MergeIterator was deleted.	2015-12-16 11:16:17 -07:00
Jason Wilder	31b97c3fe0	Add max points per block back for CacheKeyIterator Was removed when MergeIterator was removed.	2015-12-16 11:16:16 -07:00
Jason Wilder	45e87cdfe4	Strip checksum when returning block from ReadBytes	2015-12-16 11:16:16 -07:00
Jason Wilder	97435b9124	Return minTime/maxTime from BlockIterator.Read	2015-12-16 11:16:16 -07:00
Jason Wilder	ce6de9728e	Add test for BlockIterator with multiple blocks for a key	2015-12-16 11:16:16 -07:00
Jason Wilder	4a3037814f	Add WriteBlock to TSMWriter	2015-12-16 11:16:16 -07:00
Jason Wilder	d99c1f944e	Add BlockIterator for reading TSM blocks without decoding	2015-12-16 11:16:16 -07:00
Jason Wilder	928aef04cd	Split data_file.go into reader.go and writer.go	2015-12-16 11:16:16 -07:00
Alexandre Viau	ad1044dde9	typo: unkown -> unknown	2015-12-15 18:10:47 -05:00
Philip O'Toole	01ac0b3f23	Tweak compaction log messages	2015-12-15 10:33:13 -08:00
Philip O'Toole	a6cdb5229d	Log tsm initialization	2015-12-14 15:50:56 -08:00
Philip O'Toole	03f8cd3956	Add comment explaining magic number	2015-12-10 11:46:40 -08:00
Jason Wilder	631ecc23de	Fix growing destination buffer during WAL entry encoding The test to see if the destination buffer for encoding and decoding a WAL entry was broken and would cause a panic if there were large batches that would overflow the buffer size. Fixes #5075	2015-12-10 11:46:40 -08:00
Ady	07c0939fe1	Added logic to let the memory mapped files be renamed by the OS. Now a copy is created in memory with SHARED_DELETE flag, so that OS is free to rename or delete original file	2015-12-10 01:07:50 +05:30
Jason Wilder	992aea7bd3	Merge pull request #5060 from influxdb/jw-drop-db Cancel writing TSM files when engine closes	2015-12-08 16:16:07 -07:00
Paul Dix	b192136887	Merge pull request #5058 from influxdb/pd-update-compaction-logic Update TSM compaction logic	2015-12-08 18:14:15 -05:00
Paul Dix	27cc2ea0cc	Update compact.Plan	2015-12-08 18:01:31 -05:00
Jason Wilder	d7cff651d1	Cancel writing TSM files when engine closes If the engine is closed while a compaction is going on, the close call blocks until the goroutine exits. This could be several minutes because the control does not return back up to the channel selector while there is still data to write.	2015-12-08 15:41:53 -07:00
Paul Dix	96445a53a7	Update TSM compaction logic * Update compaction to look at newest files of the smallest step first * Update compaction to look at older files in larger steps if newer files don't have enough small steps to compact * Changed the TestDefaultCompactionPlanner_CombineSequence test to reflect what's possible now. We'd only have multiple files in the same generation if the all files but one were over the max allowable size. * Clean up the logic on when full compactions are run and when planning can be skipped	2015-12-08 17:33:38 -05:00
Jason Wilder	62cb3a1e9b	Merge pull request #5057 from influxdb/jw-5046 Fix leaking TSM files when compacting	2015-12-08 13:11:46 -07:00
Jason Wilder	3543917a74	Avoid allocating strings during search	2015-12-08 13:02:17 -07:00
Jason Wilder	99c313ddae	Fix leaking TSM files when compacting The files being read were not closed after the compaction ran causing them to leak. Fixes #5046	2015-12-08 12:55:30 -07:00
Jason Wilder	9d82e24ca0	Fix performance of dropping large number of keys	2015-12-08 10:47:06 -07:00
Jason Wilder	f245b44afa	Set full compaction duration option on planner Was set on engine and not planner so it was always 0.	2015-12-08 09:56:36 -07:00
Jason Wilder	d32aeb2535	Merge pull request #5031 from influxdb/jw-mintime Dedupe points at query time if there are overlapping blocks	2015-12-07 21:28:29 -07:00
Jason Wilder	87892d79da	Dedupe points at query time if there are overlapping blocks	2015-12-07 21:10:10 -07:00
Fazal Majid	bb386219f4	ran go fmt on mmap_solaris.go #4787	2015-12-07 17:41:26 -08:00
Fazal Majid	0f889a77d1	fix tsm1 for Solaris #4787 , passes unit tests now	2015-12-07 17:14:26 -08:00
Jason Wilder	a2583d2be1	Reduce lock contention when planning TSM queries	2015-12-07 15:42:36 -07:00
Jason Wilder	4da20c49e9	Optimize TSM file scanning for time queries Move the index locations planning to be lazily created after the first seek when we know what time and direction we're searching for. This allows files and blocks to be skipped before having to scan the files index. This improves query times with time filters when there are many TSM files on disk.	2015-12-07 15:42:36 -07:00
Paul Dix	93d6afec97	Merge pull request #5019 from influxdb/jw-mintime Remove min time from TSM blocks	2015-12-07 15:00:12 -05:00
Paul Dix	8096c6b845	Update TSM, address PR #5011 comments * Moved TSM file extension to a constant * Fixed typos * Changed group.size() back to being a uint64 since it can have multiple files up to 4GB each.	2015-12-07 14:47:17 -05:00
Paul Dix	820b0d31d6	Update TSM to delete from the WAL/cache * Update cache loader to delete entries from cache * Add cache.Delete() * Update delete to look at keys in the Cache in addition to the FileStore * Update cache compaction to never happen if the cache is empty	2015-12-07 14:35:48 -05:00
Jason Wilder	cf341eaa6a	Remove MinTime from blocks MinTime is not in the index for each block so storing it in the block header is redundant. The encodings also store it in their header so we are actually storing it 3 times. Removing this is an incompatible change with the current tsm1 file format.	2015-12-07 11:26:58 -07:00
Adarsha	5482c6de03	Avoid closing the handle in mmap Added mmap implementation for Windows. It uses MapViewOfFile similar to Bolt's implementation. MapViewOfFile returns a pointer and not a byte array. Bolt changed their data structure to support it. Instead of changing the implementation of tsm data structure, I used a trick shown in https://groups.google.com/forum/#!topic/golang-nuts/g0nLwQI9www to use SliceHeader to convert the pointer into a slice. Bolt's implementation also closes the file handle in mmap itself. It was resulting in a timeout, so implemented https://github.com/edsrzf/mmap-go/blob/master/mmap_windows.go logic to keep file handle open until munmap	2015-12-07 23:30:19 +05:30
Paul Dix	440a8a8a1f	Change all TSM file sizes to uint32	2015-12-07 10:12:24 -05:00
Paul Dix	937233d988	Update TSM compaction planning logic * Update Plan to do a full compaction if cold for writes * Remove MaxFileSize as a config variable from Compactor. Should be a set constant * Update Plan to keep track of if the last check was fully compacted so we can skip future planning calls * Update compact min file count to 3 so that compactions run more frequently	2015-12-07 08:26:30 -05:00
Paul Dix	1bee7d1512	Update TSM, remove old version, add config * remove rolloverTSMFileSize constant that is no longer used * remove the maxGenerationFileCount since it is no longer a limitation that's necessary with the new compaction scheme. We no longer read WAL segments as part of the compaction so memory is only used as we read in each individual key * remove minFileCount and switch to a user configurable variable * remove the mutex from WALSegmentWriter. There's never more than one open in the WAL at one time and it's not exported through any function so the lock on the WAL should be used. This simplified keeping track of the last write time and removed a bunch of unnecessary locks. * update WALSegmentWriter.Write to take the compressed bytes so that encoding and compression can occur before the call to write (while we don't hold the WAL lock) * remove a bunch of unnecessary locking in WAL.writeToLog * Add check for TSM file magic number and version * Remove old tsm, log, and unused cursor code * Remove references to tsm1dev everywhere except in the inspector * Clean up config options for compaction and snapshotting * Remove old TSM configuration options * Update the config.sample.toml with TSM options * Update WAL compact to force if it has been cold for writes for a configurable period of time (1h by default)	2015-12-06 18:50:39 -05:00
Philip O'Toole	6e88547a5e	Support shutting down engine goroutines This was causing races in the code, when the cache was being reloaded, because back-to-back open-and-closing of the engine during testing left goroutines running. With this change the engine is completely shutdown when Close() is called on it.	2015-12-06 09:16:38 -08:00
Philip O'Toole	0d0b919144	Integrate CacheLoader with tsm2 engine	2015-12-05 22:13:57 -08:00
Philip O'Toole	fe7b3ad134	Add CacheLoader The CacheLoader loads a given cache from a slice of segment files.	2015-12-05 22:13:57 -08:00
Philip O'Toole	4b5fb8db72	WALSegmentReader counts bytes read without error	2015-12-05 22:13:57 -08:00
Philip O'Toole	c67831bc79	Remove double-checking of error when reading WAL	2015-12-05 22:13:57 -08:00
Paul Dix	40e606cb14	Merge pull request #5003 from influxdb/jw-compaction Update compaction planning	2015-12-05 16:49:54 -05:00
Jason Wilder	33a33e6a23	Fix 32bit int overflow of constant value	2015-12-05 13:09:18 -07:00
Jason Wilder	41b24995a7	Compaction fixes	2015-12-05 12:19:28 -07:00
Philip O'Toole	7296de1fac	Merge pull request #4999 from influxdb/cache_sort Always copy the Cache values for query and merge with snapshot	2015-12-05 08:15:13 -08:00
Philip O'Toole	1b12ff9c1c	Only take write-lock for Values when necessary	2015-12-05 08:06:01 -08:00
Jason Wilder	6592615958	Updated compaction strategy This changes compacting files to merge sequences of files in lower generations up to later generations	2015-12-04 23:30:39 -07:00
Philip O'Toole	789ab10658	Merge hot cache values with snapshots This change starts by building the sequence of entries, which also allows the required size of destination buffer to be calculated. Then the buffer is allocated up-front in 1 call. Each snapshot and hot value-set is appended to the buffer. If ordering is violated at anytime, set the 'needSort' flag. Sorting, if necessary, is performed just before returning the data.	2015-12-04 20:58:02 -08:00


613 Commits (fb7388cdfc169921aaac976234d5b7063677afc1)