influxdb

Commit Graph

Author	SHA1	Message	Date
Jason Wilder	cbbbe8bedb	Delete series in batches This fixes a regression where deleting series keys would happen one at a time instead of in bulk.	2017-10-24 11:06:21 -06:00
Jason Wilder	05131f4453	Fix indirectIndex not removing fully deleted series If multiple tombstones exist for a series that ended up causing the full data to be deleted, the blocks were not removed from the offsets in the index. This causes the TSMReader to report that a key exists but does not have any data. During a compaction, every key should have at least one value. Since this invariant was broken, the compaction aborted early and ended up dropping all series keys that are lexicographically greater than where the breakage occurred. This would cause data to be dropped during the compaction.	2017-10-18 18:16:41 -06:00
Jason Wilder	9f102adabe	Abort BlockIterator iteration if deletes detected This fixes a potential bug where the BlockIterator would skip blocks if the underlying TSMReader had deletes on it concurrently. This could possibly occur due to changes in `91eb9de3` that now use the existing TSMReaders from the FileStore instead of creating new ones during compaction.	2017-10-18 18:16:37 -06:00
Jason Wilder	4d171f3f40	Fix data deleted outside of time range	2017-10-18 13:39:47 -06:00
Stuart Carnie	a0848eac8c	remove unnecessary err value readKey never sets error, so it is always nil	2017-10-12 08:28:53 -07:00
Jason Wilder	3af9c7df37	Remove a defer allocation Shows up under high cardinality compactions.	2017-10-03 10:48:14 -06:00
Stuart Carnie	92756ec0ad	Reduce allocations, improve readEntries performance by simplifying loop * callers of ReadEntries and Key API can cache allocated slice	2017-09-19 11:57:10 -07:00
Jason Wilder	31646aae3a	Release mmap pages when shard is cold This instructs the kernel that it can release memory used by mmap'd TSM files when they are not actively being used. If the mappings are used, the kernel will fault the pages back in. On Linux, this causes RES memory to drop immediately when run.	2017-09-18 11:51:51 -06:00
Jason Wilder	7d467c2047	Fix Windows unmapping of anonymous index slice	2017-09-12 10:30:10 -06:00
Jason Wilder	26f92ce6ac	Remove commented out code	2017-09-11 15:30:05 -06:00
Jason Wilder	4ed9c75896	Fix unmapping anonymous memory slice	2017-09-11 15:29:26 -06:00
Jason Wilder	d3e832b462	Use offheap memory for indirect index offsets slice	2017-09-11 15:29:25 -06:00
Jason Wilder	778000435a	Convert all keys from string to []byte in TSM engine This switches all the interfaces that take a string series key to take a []byte. This eliminates many small allocations where we convert between the two repeatedly. Eventually, this change should propagate further up the stack.	2017-07-28 11:00:50 -06:00
Stuart Carnie	eec80692c4	Taught tsm1 storage engine how to read and write uint64 values * introduced UnsignedValue type * leveraged existing int64 compression algorithms (RLE, Simple 8B) * tsm and WAL can read and write UnsignedValue * compaction is aware of UnsignedValue * unsigned support to model, cursors and write points NOTE: there is no support to create unsigned points, as the line protocol has not been modified.	2017-07-24 09:03:22 -07:00
Jason Wilder	5e11cdcdd7	Fix incorrect condition in OverlapsKeyRange The min key was not used in OverlapsKeyRange which caused it to return false when it should be true. This causes a bug where deletes would not write tombstones for files that actually contained the data it was supposed to delete.	2017-07-07 12:19:33 -06:00
Jason Wilder	29c2b1958e	Fix deletes triggering unnecessary compactions Tombstone files would be written to all TSM files even if the deleted keys or timerange did not exist in the TSM file. This had the side effect of causing shards to get recompacted back to the same state. If any shards or large numbers of TSM files existed, disk usage and CPU utilization would spike causing issues. This prevents tombstones being written for TSM files that could not possibly contain the series keys being deleted or if the deleted time range is outside the range of the file.	2017-05-08 14:52:28 -06:00
Jason Wilder	f87fd7c7ed	Stop background compaction goroutines when shard is cold Each shard has a number of goroutines for compacting different levels of TSM files. When a shard goes cold and is fully compacted, these goroutines are still running. This change will stop background shard goroutines when the shard goes cold and start them back up if new writes arrive.	2017-05-03 16:31:57 -06:00
Jason Wilder	1bc4936336	Export Reader.ReadBytes	2017-04-28 13:20:55 -06:00
Jason Wilder	5fa8073fc2	Merge branch '1.2' into jw-merge-123	2017-04-04 11:12:06 -06:00
Jason Wilder	32c4d43952	Speed up drop measurement This reworks drop measurement to use a sorted list of series keys instead of creating an intermediate map. It removes allocations and some extra garbage that is created during drop measurement.	2017-04-03 08:57:53 -06:00
Jason Wilder	6232d5e56d	Remove defer allocations in TSMReader	2017-04-03 08:57:52 -06:00
Jason Wilder	7119ef8f29	Merge pull request #8193 from influxdata/jw-123-backports 1.2.3 backports	2017-03-23 13:31:35 -06:00
Jason Wilder	a1c84ae6f3	Add block type for BlockIterator	2017-03-23 12:49:17 -06:00
Jason Wilder	2972a3f223	Remove MMAP dereferencing code This code was added to address some slow startup issues. It is believed to be the cause of some segfault panics that occur at query time when the underlying MMAP array has been unmapped. The current structure of code makes this change unnecessary now.	2017-03-23 12:46:23 -06:00
Jason Wilder	78b7815c49	Add block type for BlockIterator	2017-03-09 09:16:59 -07:00
Jason Wilder	194c5adfaf	Fix race on t.refs Read at 0x00c42018f620 by goroutine 58: github.com/influxdata/influxdb/tsdb/engine/tsm1.(TSMReader).Close() /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/reader.go:330 +0x94 github.com/influxdata/influxdb/tsdb/engine/tsm1.(FileStore).Close() /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/file_store.go:464 +0x123 Previous write at 0x00c42018f620 by goroutine 63: sync/atomic.AddInt64() /usr/local/go/src/runtime/race_amd64.s:276 +0xb github.com/influxdata/influxdb/tsdb/engine/tsm1.(TSMReader).Unref() /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/reader.go:352 +0x43 github.com/influxdata/influxdb/tsdb/engine/tsm1.(KeyCursor).Close()	2017-01-07 12:39:45 -07:00
Mark Rushakoff	41415cf2fb	Update godoc for tsm1 package	2017-01-02 07:30:18 -08:00
Jason Wilder	1b462312a9	Re-use decoder pools The decoders were held by each iterator to avoid creating them all the time. Some of them use quite a bit of memory so they can be expensive to create when querying across many series. Instead, move them to a re-usable pool where we create the minimum that could actively be in use. This reduces garbage as well as makes the iterators less expensive to create.	2016-10-03 10:21:54 -06:00
Jason Wilder	c2cfd63091	Avoid stat syscall when planning compactions When the planner runs, it needs to determine if any files have tombstones. The code to determine if a tombstone existed involved stating the .tombstone file. Since the planner runs very frequently when there are many shards, this causes a lot of system calls that are unnecessary. Instead, cache the results of the stat calls and only refresh them when we haven't checked at least once or we write new tombstone data. This also caches the results of the TSMReader.Stats call to avoid creating garbage.	2016-09-24 15:53:28 -06:00
Jonathan A. Sternberg	8b234546a8	Merge pull request #7204 from influxdata/1.0 Merge 1.0 branch to master	2016-08-25 15:20:30 -05:00
Jonathan A. Sternberg	10029caf2f	Support negative timestamps in the query engine Negative timestamps are now supported. We also now refuse two nanoseconds that are at the edge of the minimum time window. One of the nanoseconds we do not accept is because we need MinInt64 to be used for some internal comparisons in the TSM engine and it was causing an underflow when we subtracted one from the minimum time. The second is so we can have one minimum time that signifies the default minimum that nobody can write to (so we can implicitly rewrite the timestamp on aggregate queries) but still use the explicit timestamp if it is given to us by the user. We aren't able to tell the difference between if the user provided it or if it was implicit without those values being different. If the default minimum time is used with an aggregate query, we rewrite the time to be the epoch for backwards compatibility since we believe that's more important than supporting that extra nanosecond.	2016-08-25 12:52:41 -05:00
Ben Johnson	cc628a1097	Fix mmap dereferencing Adds a missing dereference call to `Close()` as well as fixes a tag copy issue.	2016-08-24 10:48:07 -06:00
Ben Johnson	8aa224b22d	reduce memory allocations in index This commit changes the index to point to index data in the shards instead of keeping it in-memory on the heap.	2016-08-16 14:09:00 -06:00
Jason Wilder	e8e6bc44a7	Remove defers in TSM reader read path	2016-08-02 16:39:45 -06:00
Jason Wilder	5576e7fedb	Simplifications	2016-07-28 20:25:37 -06:00
Jason Wilder	ef8ecf0e90	Apply reload tombstones in batches This keeps some memory bounds when reloading a TSM file's tombstones so that the heap does not grow exceedingly fast and stay there after the deletes are applied.	2016-07-28 20:25:36 -06:00
Jason Wilder	7b8959f6f2	Apply tombstones iteratively at startup Tombstones were read fully into memory at startup which could consume a lot of RAM and OOM the process if there were a lot of deleted series and many TSM files. This now walks the tombstone file and iteratively applies the tombstone which uses significantly less RAM. This may be slightly slower in the general case, but should scale better.	2016-07-28 20:25:36 -06:00
Jason Wilder	822f409b31	Allow queries to complete before closing TSM files If a query was running against a file being compacted, we closed the file and the query would end wherever it had read up to. This could result in queries that randomly lost data, but running them again showed the full results. We now use a reference counting approach and move the in-use files out of the way in the filestore and allow the queries to complete against the old tsm files. The new files are installed and new queries will use them. Fixes #5501	2016-07-21 12:13:04 -06:00
Mark Rushakoff	39f12e376c	Defend against some boundary errors in TSM reading	2016-07-19 15:43:27 -07:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the beginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	57cb3fdbc0	Merge pull request #6522 from influxdata/tp-tsm-dump Dump TSM files to line protocol	2016-05-03 10:44:33 -06:00
Jason Wilder	2d09937fd2	Fix removing fully deleted index blocks If multiple tombstone entries happen to exist for the same key in a tombstone file, it was possible to panic. The first application would remove all index entries and the second time around the code still assumed entries would exist and would index into the nil slice. Also fixes a case where the range of time would fully delete all index entries, but it did not align with math.MinInt64 and math.MaxInt64. This would cause the index locations to still exist in the offset slice. This is inefficient because the BlockIterator would still scan and decode the block only to discover that all the values are deleted. We now just remove it from the offsets slice in this case since the range of values is deleted.	2016-05-02 11:36:05 -06:00
Jason Wilder	58aa65d5a8	Optimize applyTombstones When a large tombstone file existed on disk, this code was slow since it would apply each tombstone to the index one at a time causing the index to be scanned for each key. Instead, we group all the tombstones together by timestamp and apply in bulk so that the index is scanned once for each set of tombstones. If we change to immutable tombstone files, it might be better to just write a file where all the keys have the same tombstone so we can re-apply them efficiently.	2016-05-02 11:36:05 -06:00
Jason Wilder	c73c7cea25	Revert filtering index entries in BlockIterator This was the wrong fix. The real issue was the tombstones were being read incorrectly and also applied incorrectly at times. This code is slower and not necessary so reverting it.	2016-05-02 11:36:04 -06:00
Todd Persen	9eb4c1ec57	Fix typo in comment.	2016-04-29 16:26:27 -07:00
Jason Wilder	abcb559b09	Remove index meta data when series and measurements are gone This removes the dropMeta param from the tsdb.Store.DeleteSeries and lets the shard determine when to remove the meta data from the index based on what series still have data in the shard. This uncovered a nasty bug in compactions where a fully deleted series would prematurely end the compactions and not carry forward the rest of the data in the TSM file. This is now fixed as well.	2016-04-29 16:31:57 -06:00
Jason Wilder	aefd2ad08b	Add DeleteSeries and DeleteSeriesRange	2016-04-27 13:09:53 -06:00
Jason Wilder	bf3aa5857d	Don't add tombstone for timerange not contained by file	2016-04-27 13:09:53 -06:00
Jason Wilder	6042e114a1	Remove tombstoned values during compaction This will skip blocks that are fully tombstoned as well as remove points that have been removed within a block.	2016-04-27 13:09:53 -06:00
Jason Wilder	c154cd4b4a	Remove TSMReaderOptions Not used	2016-04-27 13:09:52 -06:00
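
The OverlapsKeyRange fix in commit 5e11cdcdd7 turns on a simple condition: a delete's key range can only touch a file if it intersects the file's own [min key, max key] range, and both bounds have to take part in the check. The following is a minimal sketch of such a check, not the tsm1 implementation; the `keyRange` type, its fields, and the method name are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// keyRange is a hypothetical stand-in for the smallest and largest
// series keys stored in a single TSM file.
type keyRange struct {
	minKey, maxKey []byte
}

// overlapsKeyRange reports whether the closed range [min, max] intersects
// the file's key range. Both the file's min key and max key are compared;
// leaving out either comparison gives wrong answers for ranges that lie
// entirely on one side of the file.
func (r keyRange) overlapsKeyRange(min, max []byte) bool {
	return bytes.Compare(r.minKey, max) <= 0 && bytes.Compare(r.maxKey, min) >= 0
}

func main() {
	f := keyRange{minKey: []byte("cpu,host=a"), maxKey: []byte("cpu,host=m")}

	// The delete range falls inside the file's keys, so a tombstone is needed.
	fmt.Println(f.overlapsKeyRange([]byte("cpu,host=b"), []byte("cpu,host=c"))) // true

	// The delete range sorts entirely after the file's keys, so it can be skipped.
	fmt.Println(f.overlapsKeyRange([]byte("mem,host=a"), []byte("mem,host=z"))) // false
}
```

A check of this shape serves both related commits: returning true when the ranges intersect ensures tombstones are written where data exists (5e11cdcdd7), and returning false otherwise avoids tombstones on files that cannot contain the deleted keys (29c2b1958e).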


85 Commits (7cae889b13f76df9fa81786231ab07b45b2cb76d)