influxdb

Commit Graph

Author	SHA1	Message	Date
Edd Robinson	12a2ff7fac	Add support for TSI shard streaming and shard size This commit firstly ensures that a shard's size on disk is accurately reported when using the tsi1 index, by including the on-disk size of the tsi1 index in the calculation. Secondly, this commit add support for shard streaming/copying when using the tsi1 index. Prior to this, a tsi1 index would not be correctly restored when streaming shards.	2017-11-28 15:57:02 +00:00
Jason Wilder	02dbe6dbd3	Fix KeyCursor not return remaing blocks If the first block that needs to be read was partially deleted such that the trailing end has no values, it was possible for the query cursor end early. This was caused by the KeyCursor.ReadFloatBlock returning no values instead of checking the remaing blocks.	2017-11-16 15:23:34 -07:00
Jason Wilder	80cd5e63af	Optimize DeleteSeriesRange This removes more allocations and speeds up some critical sections.	2017-11-13 09:02:10 -07:00
Jason Wilder	5a775c50d9	Add DeleteRangeWith This is a version of DeleteRange that take a func predicate to determine whether a series key should be deleted or not. This avoids the large slice allocations with higher cardinalities.	2017-11-13 08:50:07 -07:00
Jonathan A. Sternberg	0b7c56bcd8	Update the zap logger dependency The previous sha was taken from a revision on a devel branch that I thought would continue staying in the tree after it was merged. That revision was rebased away and the API was changed for the logger. This updates the usage of the logger and adds a simple package for constructing the base logger. The 1.0 version of zap changed the format of the default console logger so this change moves over to this new logger instead of attempting to retain backwards compatibility with the old format.	2017-11-10 16:27:16 -06:00
Stuart Carnie	e9313876ab	EXPLAIN ANALYZE * Introduces EXPLAIN ANALYZE command, which produces a detailed tree of operations used to execute the query. introduce context.Context to APIs metrics package * create groups of named measurements * safe for concurrent access tracing package EXPLAIN ANALYZE implementation for OSS Serialize EXPLAIN ANALYZE traces from remote nodes use context.Background for tests group with other stdlib packages additional documentation and remove unused API use influxdb/pkg/testing/assert remove testify reference	2017-10-20 08:01:37 -07:00
Jason Wilder	778000435a	Conver all keys from string to []byte in TSM engine This switches all the interfaces that take string series key to take a []byte. This eliminates many small allocations where we convert between to two repeatedly. Eventually, this change should propogate futher up the stack.	2017-07-28 11:00:50 -06:00
Stuart Carnie	eec80692c4	Taught tsm1 storage engine how to read and write uint64 values * introduced UnsignedValue type * leveraged existing int64 compression algorithms (RLE, Simple 8B) * tsm and WAL can read and write UnsignedValue * compaction is aware of UnsignedValue * unsigned support to model, cursors and write points NOTE: there is no support to create unsigned points, as the line protocol has not been modified.	2017-07-24 09:03:22 -07:00
Jason Wilder	3839fe34ea	Remove FileStore.Add/Remove Can use Replace which handles files in-use and stats correctly.	2017-04-28 13:20:55 -06:00
Jonathan A. Sternberg	2fe48d6781	Rename zap import back to github.com/uber-go/zap They rebased a revision we were previously relying upon that allowed us to use the vanity name so we are reverting back to an older version with the old import path.	2017-02-17 17:17:22 -06:00
Gustav Westling	56d98325da	Removed ineffective assignments, and added checks for errors that previsouly was not checked	2016-12-29 20:26:15 +01:00
Jonathan A. Sternberg	ec57108520	Use proper uber-go/zap import path It looks like the real import path to the project is go.uber.org/zap instead of github.com/uber-go/zap since the example in the project references that path.	2016-12-15 08:54:14 -06:00
Jonathan A. Sternberg	21502a39e8	Switch logging to use structured logging everywhere The logging library has been switched to use uber-go/zap. While the logging has been changed to use structured logging, this commit does not change any of the logging statements to take advantage of the new structured log or new log levels. Those changes will come in future commits.	2016-12-14 10:45:15 -06:00
Edd Robinson	0ee093f1fb	Memoize output of FileStore.Stats	2016-10-24 10:23:20 -06:00
Jason Wilder	1b462312a9	Re-use decoder pools The decoders were held onto each iterator to avoid creating them all the time. Some of them have use quite a bit of memory so they can be expensive to create when querying across many series. Intead, more them to a re-usable pool where we create the minimum that could active be in use. This reduces garbage as well as makes the iterators less expensive to create.	2016-10-03 10:21:54 -06:00
Jason Wilder	822f409b31	Allow queries to complete before closing TSM files If a query was running against a file being compacted, we close the file and the query would end wherever it had read up to. This could result in queries that randomly lost data, but running them again showed the full results. We now use a reference counting approach and move the in-use files out of the way in the filestore and allow the queries to complete against the old tsm files. The new files are installed and new queries will use them. Fixes #5501	2016-07-21 12:13:04 -06:00
Jason Wilder	ca6bfac01a	Fix out of order blocks returned during query If there were blocks in later TSM files that were for overwritten points or writes into the past, they could be returned more than once or out of order causing the cursor values to be unsorted. One effect of this is that graphs in graphana would render with the line going all over the place in spots. This might also cause duplicate data to be returned. Fixes #6738	2016-06-22 17:34:44 -06:00
Jason Wilder	0b481ff627	Fix pathalogical TSM query case This fixes a pathalogical query condition cause by and problematic structuring of TSM files based on how points were written. The condition can occur when there are multiple TSM files and a large number of points are written into the past. The earlier existing TSM files must also have points in the past and close to the present causing their time range to eclipse the later files. When this condition occurs, some queries can spend an excessive amount of time merge all the overlapping blocks. The fix was to constrain the window of overlapping blocks based on the first one we ran into. There was also a simple case in the Merge where we could skip the binary search path and just append the two inputs.	2016-05-25 09:14:17 -06:00
Edd Robinson	f680ab0f0d	Fix vet issues	2016-05-18 13:34:11 +01:00
Cory LaNou	f415cf89ad	wip	2016-05-10 11:01:03 -05:00
Jason Wilder	d99c5e26f6	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556.	2016-05-05 22:31:30 -06:00
Jason Wilder	a0ac754802	Fix loading huge series into RAM when points are overwritten In some query scenarios, if there are a lot of points on disk spread across many blocks in TSM files and a point is overwritten near the begginning of the shard's timerange, the full series could be loaded into RAM triggering OOMs and huge allocations. The issue was that the KeyCursor code that handles overwriting points had a simple implementation that just deduped the whole series in this case. This falls over when the series is quite large. Instead, the KeyCursor has been changed to only decode blocks with updated points. It then keeps track of what section of the blocks have been read so they are not re-read when the later points are decoded. Since the points in a block are always sorted, the code was also changed to remove the Deduplicate calls since they end up reallocating the slice. Instead, we do a sorted merge and re-use the slice as much as we can.	2016-05-05 09:34:44 -06:00
Jason Wilder	97504a552c	Support time range tombstones in FileStore/KeyCursor	2016-04-27 13:09:52 -06:00
Jason Wilder	a789e819a3	Remove NewTSMReaderWithOptions There are two TSMIndex implementations, the directIndex and the indirectIndex. Originally, we only had the directIndex and later added the indirectIndex and NewTSMReaderWithOptions in order to allow both indexes to be used in tests and code. This has created a problem since we really only use the directIndex for writing and always use the indirectIndex for reading. This changes removes the NewTSMReaderWithOptions func so that it is no longer possible to create a TSMReader with a directIndex. This will allow a lot of the block reading code used by the directIndex to be removed and simplify maintainence. It also gives better test coverage of the code that is actually used by the TSM engine now.	2016-04-27 13:09:52 -06:00
Ben Johnson	286072f65a	update dep: simple8b @ b421ab40	2016-04-22 09:46:05 -06:00
Ben Johnson	525e22c92b	tsm1 query engine alloc reduction This commit makes a number of performance improvements to reduce allocations during query execution. Several objects and buffers are now reused across the components to avoid allocations. Previously a simple `count(value)` query across 1M points would require 26,000+ allocations. After the changes in this commit that number has been reduced to 88.	2016-04-11 14:50:59 -06:00
Jason Wilder	9984cd5d6d	Fix skipping blocks at query time when overlaps exist Depending on how data is written across TSM files, it was possible to skip over some blocks at query time making it looks like data was missing.	2016-03-14 13:11:11 -06:00
Jason Wilder	8d70d65a82	Convert time.Time to int64	2016-02-25 15:15:01 -07:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
Jason Wilder	756421ec4a	Look for fully compacted block in addition to max size during compaction Some data shapes would cause files to grow larger than the max size more quickly which resulted in them getting skipped by the full compaction planner at times. Some datasets that could make this happen are very large keys or very large numbers of keys (10M). When this happened, multiple max sized files would accumulate but the blocks would not be full. When the shard went cold for writes, these files would get recompacted down to the optimal size, but a lot of space would be wasted in the mean time.	2016-01-07 15:18:42 -07:00
Jason Wilder	a38c95ec85	Update compactions to run concurrently This has a few changes in it (unfortuantely). The main change is to run compactions concurrently. While implementing this, a few query and performance bugs showed up that are also fixed by this commit.	2015-12-23 18:01:11 -07:00
Jason Wilder	9d82e24ca0	Fix performance of dropping large number of keys	2015-12-08 10:47:06 -07:00
Jason Wilder	87892d79da	Dedupe points at query time if there are overlapping blocks	2015-12-07 21:10:10 -07:00
Paul Dix	1bee7d1512	Update TSM, remove old version, add config * remove rolloverTSMFileSize constant that is no longer used * remove the maxGenerationFileCount since it is no longer a limitation that's necessary with the new compaction scheme. We no longer read WAL segments as part of the compaction so memory is only used as we read in each individual key * remove minFileCount and switch to a user configurable variable * remove the mutex from WALSegmentWriter. There's never more than one open in the WAL at one time and it's not exported through any function so the lock on the WAL should be used. This simplified keeping track of the last write time and removed a bunch of unnecessary locks. * update WALSegmentWriter.Write to take the compressed bytes so that encoding and compression can occur before the call to write (while we don't hold the WAL lock) * remove a bunch of unnecessary locking in WAL.writeToLog * Add check for TSM file magic number and vesion * Remove old tsm, log, and unused cursor code * Remove references to tsm1dev everywhere except in the inspector * Clean up config options for compaction and snapshotting * Remove old TSM configuration options * Update the config.sample.toml with TSM options * Update WAL compact to force if it has been cold for writes for a configurable period of time (1h by default)	2015-12-06 18:50:39 -05:00
Jason Wilder	52bec1f7f6	Change TSM file naming to generation-sequence.tsm	2015-12-04 11:51:33 -07:00
Jason Wilder	c7e37766e7	Avoid repetitive index searches when iterating over cursors First pass at TSM cursor iteration ended up searching the file indexes too frequently and hurt performance. This changes that to search it once and then have the cursor hold onto the block locations to seek to. Doubles the query performance from the first iteration, but still a lot of room for improvement.	2015-12-04 10:02:59 -07:00
Jason Wilder	adf5c5b223	Replace Next/Prev with Scan	2015-12-03 12:39:13 -07:00
Jason Wilder	be59ba3455	Add Prev support to FileStore Allows read the previous block of values given a timestamp and key.	2015-12-03 12:39:12 -07:00
Jason Wilder	6fba01df89	Implement single field TSM queries	2015-12-03 12:35:36 -07:00
Jason Wilder	4a03469662	Integrate TSM compaction into dev engine	2015-12-02 09:45:23 -07:00
Jason Wilder	9c2be12b65	Add FileStore.Remove func Allows a TSMFile to be removed from the active set of files managed by the FileStore.	2015-11-16 09:16:10 -07:00
Jason Wilder	ef18f8afb2	Handle TSM key deletions This writes a tombstone file containing a line per deleted key. This file is read when a TSMReader is created and any keys listed in the file are removed from the index.	2015-11-16 08:44:52 -07:00
Jason Wilder	0ab423c7ff	Initial FileStore implementation Provides functionality to load a directory of TSM files (or add them manually) as well as reading blocks of values for individual key and times.	2015-11-16 08:44:52 -07:00

44 Commits (b36b9f109f2da91c8941679caf5356e08eee0b2b)