influxdb

Commit Graph

Author	SHA1	Message	Date
Mark Rushakoff	41415cf2fb	Update godoc for tsm1 package	2017-01-02 07:30:18 -08:00
Gustav Westling	26b33307ae	Resolved PR comments on test files	2016-12-30 11:42:38 +01:00
Gustav Westling	56d98325da	Removed ineffective assignments, and added checks for errors that previsouly was not checked	2016-12-29 20:26:15 +01:00
Jonathan A. Sternberg	ec57108520	Use proper uber-go/zap import path It looks like the real import path to the project is go.uber.org/zap instead of github.com/uber-go/zap since the example in the project references that path.	2016-12-15 08:54:14 -06:00
Jonathan A. Sternberg	21502a39e8	Switch logging to use structured logging everywhere The logging library has been switched to use uber-go/zap. While the logging has been changed to use structured logging, this commit does not change any of the logging statements to take advantage of the new structured log or new log levels. Those changes will come in future commits.	2016-12-14 10:45:15 -06:00
Jason Wilder	e8a28cfbab	Expose Shard.LastModified This returns the LastModified time of the shard. The LastModified time is the wall time when a change to the shards state occurred. It uses the WAL or FileStore to determine the max mod time.	2016-11-23 10:04:07 -07:00
Jason Wilder	e388912b6c	Fix race in findGenerations The file store stats slice is re-used which causes the race below: WARNING: DATA RACE Write at 0x00c42007e140 by goroutine 43: github.com/influxdata/influxdb/tsdb/engine/tsm1.(FileStore).Stats() /Users/jason/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/file_store.go:511 +0x22e github.com/influxdata/influxdb/tsdb/engine/tsm1.(DefaultPlanner).findGenerations() /Users/jason/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:461 +0x6f github.com/influxdata/influxdb/tsdb/engine/tsm1.(DefaultPlanner).PlanLevel() Previous read at 0x00c42007e140 by goroutine 40: github.com/influxdata/influxdb/tsdb/engine/tsm1.(DefaultPlanner).findGenerations() /Users/jason/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:463 +0x13d github.com/influxdata/influxdb/tsdb/engine/tsm1.(*DefaultPlanner).PlanOptimize()	2016-10-28 12:15:49 -06:00
Edd Robinson	0ee093f1fb	Memoize output of FileStore.Stats	2016-10-24 10:23:20 -06:00
Jason Wilder	f254b4f3ae	Allow snapshot compactions during deletes If a delete takes a long time to process while writes to the shard are occuring, it was possible for the cache to fill up and writes to be rejected. This occurred because we disabled all compactions while writing tombstone file to prevent deleted data from re-appearing after a compaction completed. Instead, we only disable the level compactions and allow snapshot compactions to continue. Snapshots already handle deleted data with the cache and wal. Fixes #7161	2016-10-18 12:14:51 -06:00
Jason Wilder	1b462312a9	Re-use decoder pools The decoders were held onto each iterator to avoid creating them all the time. Some of them have use quite a bit of memory so they can be expensive to create when querying across many series. Intead, more them to a re-usable pool where we create the minimum that could active be in use. This reduces garbage as well as makes the iterators less expensive to create.	2016-10-03 10:21:54 -06:00
Jonathan A. Sternberg	8b234546a8	Merge pull request #7204 from influxdata/1.0 Merge 1.0 branch to master	2016-08-25 15:20:30 -05:00
Jonathan A. Sternberg	10029caf2f	Support negative timestamps in the query engine Negative timestamps are now supported. We also now refuse two nanoseconds that are at the edge of the minimum time window. One of the nanoseconds we do not accept is because we need MinInt64 to be used for some internal comparisons in the TSM engine and it was causing an underflow when we subtracted one from the minimum time. The second is so we can have one minimum time that signifies the default minimum that nobody can write to (so we can implicitly rewrite the timestamp on aggregate queries) but still use the explicit timestamp if it is given to us by the user. We aren't able to tell the difference between if the user provided it or if it was implicit without those values being different. If the default minimum time is used with an aggregate query, we rewrite the time to be the epoch for backwards compatibility since we believe that's more important than supporting that extra nanosecond.	2016-08-25 12:52:41 -05:00
Ben Johnson	cc628a1097	Fix mmap dereferencing Adds a missing dereference call to `Close()` as well as fixes a tag copy issue.	2016-08-24 10:48:07 -06:00
Edd Robinson	90ff713f21	Fix base64 encoding issue in stats Fixes #7177.	2016-08-22 15:21:31 +01:00
Ben Johnson	8aa224b22d	reduce memory allocations in index This commit changes the index to point to index data in the shards instead of keeping it in-memory on the heap.	2016-08-16 14:09:00 -06:00
Jason Wilder	5576e7fedb	Simplifications	2016-07-28 20:25:37 -06:00
Jason Wilder	030f1ef622	Include full for tombstone files The path info only contained the file name which caused tombstone files to not be removed if there were queries running against a file that was compacted. This is now consistent with the TSMReader.Path which returns the full path info.	2016-07-28 20:25:37 -06:00
Jason Wilder	c3fda24cf9	Make sure all in-use files are tracked break cause the first one to be tracked and all others would leak as temp files that would not be removed until the server restarted.	2016-07-28 20:25:37 -06:00
Jason Wilder	4436e65fb9	Apply deletes to TSM files concurrently	2016-07-28 20:25:36 -06:00
Jason Wilder	fb5a143b08	Fix typos	2016-07-21 12:13:04 -06:00
Jason Wilder	822f409b31	Allow queries to complete before closing TSM files If a query was running against a file being compacted, we close the file and the query would end wherever it had read up to. This could result in queries that randomly lost data, but running them again showed the full results. We now use a reference counting approach and move the in-use files out of the way in the filestore and allow the queries to complete against the old tsm files. The new files are installed and new queries will use them. Fixes #5501	2016-07-21 12:13:04 -06:00
Edd Robinson	f37e726869	Add trace logging statements to tsdb	2016-07-21 11:14:29 +01:00
Edd Robinson	44231abcbd	Add trace logger controlled via DataLoggingEnabled	2016-07-21 11:14:29 +01:00
Edd Robinson	83cc580ff8	Tidy up logging	2016-07-21 11:14:29 +01:00
Jonathan A. Sternberg	12a33fe0d3	Add stats and diagnostics to the TSM engine Track the number of TSM files in the file store and keep engine statistics related to the number of TSM compactions.	2016-07-07 19:35:55 -05:00
Jonathan A. Sternberg	837a9804cf	Refactoring the monitor service to avoid expvar Truncate the time interval output of the monitor service to be on even time intervals rather than on every minute based on the start time. This normalizes the output from the monitor service.	2016-07-07 11:13:58 -05:00
Jason Wilder	ca6bfac01a	Fix out of order blocks returned during query If there were blocks in later TSM files that were for overwritten points or writes into the past, they could be returned more than once or out of order causing the cursor values to be unsorted. One effect of this is that graphs in graphana would render with the line going all over the place in spots. This might also cause duplicate data to be returned. Fixes #6738	2016-06-22 17:34:44 -06:00
Jason Wilder	a74ea4cbf4	Allow creating shards in a disable state For restoring a shard, we need to be able to have the shard open, but disabled. It was racy to open it and then disable it separately since writes/queries could occur in between that time.	2016-06-01 16:17:18 -06:00
Jason Wilder	0b481ff627	Fix pathalogical TSM query case This fixes a pathalogical query condition cause by and problematic structuring of TSM files based on how points were written. The condition can occur when there are multiple TSM files and a large number of points are written into the past. The earlier existing TSM files must also have points in the past and close to the present causing their time range to eclipse the later files. When this condition occurs, some queries can spend an excessive amount of time merge all the overlapping blocks. The fix was to constrain the window of overlapping blocks based on the first one we ran into. There was also a simple case in the Merge where we could skip the binary search path and just append the two inputs.	2016-05-25 09:14:17 -06:00
Jason Wilder	7fb7faaaca	Fix points already read from being returned more than once If there were duplicate points in multiple blocks, we would correctly dedup the points and mark the regions of the blocks we've read. Unfortunately, we were not excluding the already points as the cursor moved to points in the later blocks which could cause points to be return twice incorrectly. Fixes #6611	2016-05-18 17:21:10 -06:00
Cory LaNou	f415cf89ad	wip	2016-05-10 11:01:03 -05:00
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the begginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	c8bd41c2d8	Remove TSM reader Keys func It's very inneficient and should never be used.	2016-04-27 13:09:52 -06:00
Jason Wilder	97504a552c	Support time range tombstones in FileStore/KeyCursor	2016-04-27 13:09:52 -06:00
Jason Wilder	a789e819a3	Remove NewTSMReaderWithOptions There are two TSMIndex implementations, the directIndex and the indirectIndex. Originally, we only had the directIndex and later added the indirectIndex and NewTSMReaderWithOptions in order to allow both indexes to be used in tests and code. This has created a problem since we really only use the directIndex for writing and always use the indirectIndex for reading. This changes removes the NewTSMReaderWithOptions func so that it is no longer possible to create a TSMReader with a directIndex. This will allow a lot of the block reading code used by the directIndex to be removed and simplify maintainence. It also gives better test coverage of the code that is actually used by the TSM engine now.	2016-04-27 13:09:52 -06:00
Ben Johnson	286072f65a	update dep: simple8b @ b421ab40	2016-04-22 09:46:05 -06:00
Stephen Gutekanst	9dc09c5257	Make logging output location more programmatically configurable (#6213 ) This has various benefits: - Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily. - More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.` fields, etc). - This is also more efficient, because it means `executeQuery` no longer allocates a single `log.Logger` each time it is called.	2016-04-20 21:07:08 +01:00
Seif Lotfy	c6e3c87e00	Add Block checksum validation and "influx_inspect verify" tool Fixes #5502	2016-04-19 22:33:03 +02:00
Pierre Fersing	29b19a2293	Fix deadlock in tsm1/file_store	2016-04-12 09:39:21 +02:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Jason Wilder	1b08e2dd55	Use walk func to load all tsm keys to index Avoids allocating a big map or all keys.	2016-03-29 12:59:26 -06:00
Jason Wilder	03ced4cc90	Load shards concurrently	2016-03-29 12:58:52 -06:00
Ben Johnson	6e1c1da25b	reduce allocations in query execution This commit removes some heap objects by converting them from pointer references to non-pointers or by reusing buffers.	2016-03-22 09:51:39 -06:00
Jason Wilder	7567453c9a	Ensure TSM files are fsync'd Make sure TSM files are fsync'd when closed and also that the parent dir is fsync'd when they are renamed.	2016-03-21 15:03:52 -06:00
Jason Wilder	9984cd5d6d	Fix skipping blocks at query time when overlaps exist Depending on how data is written across TSM files, it was possible to skip over some blocks at query time making it looks like data was missing.	2016-03-14 13:11:11 -06:00
Mark Rushakoff	cdcb079769	Tag TSM stats with database, retention policy ... by extracting the db/rp from the given path. Now that the code has "standardized" on extracting db/rp this way, the ShardLocation struct is no longer necessary and thus has been removed. We're back on the previous style of passing the path and walPath to NewShard.	2016-02-29 09:17:34 -08:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Mark Rushakoff	602043e11b	Add disk stats for FileStore	2016-02-19 16:37:34 -08:00
Ben Johnson	b8918a780c	integer support	2016-02-10 09:40:25 -07:00

1 2

89 Commits (bd8dd9a29107e1cb5e7b5674779f332efa6eb3d6)