One shard might be able to run a compaction, but could fail due to
limits being hit. This loop would continue indefinitely as the
same task would continue to be rescheduled.
With higher cardinality or larger series keys, the files can roll
over early, which causes them to take longer to be compacted by higher
levels. This results in larger disk usage and a higher number of TSM
files at times.
This changes the compaction scheduling to better utilize the cores
that are free. Previously, each level was planned in its own goroutine
and would kick off a number of compaction groups. The problem with this
model was that if there were 4 groups and 3 completed quickly, planning
would be blocked for that level until the last group finished. If the
compactions at the prior level were running more quickly, a large backlog
could accumulate.
This now moves the planning to a single goroutine that plans each level in
succession and starts as many groups as it can. When one group finishes,
the planning will start the next group for the level.
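A rough sketch of that planning loop, with hypothetical planner/group types
and a made-up concurrency limit standing in for the real tsm1 scheduler:

    package compact

    import "sync"

    type group struct{ level int }

    // planner is a stand-in for the per-level compaction planner.
    type planner interface {
        Plan(level int) []group
    }

    // planAndRun sketches the single planning goroutine: it plans each level
    // in succession and starts as many groups as the concurrency limit allows.
    // When a group finishes, its slot frees up and the next planned group
    // starts, regardless of which level it belongs to.
    func planAndRun(p planner, maxConcurrent int, run func(group)) {
        sem := make(chan struct{}, maxConcurrent)
        var wg sync.WaitGroup

        for level := 1; level <= 4; level++ {
            for _, g := range p.Plan(level) {
                sem <- struct{}{} // wait for a free worker slot
                wg.Add(1)
                go func(g group) {
                    defer wg.Done()
                    defer func() { <-sem }()
                    run(g)
                }(g)
            }
        }
        wg.Wait()
    }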
The fsyncs caused by large writes to TSM files and the WAL can
eventually cause large pauses. Since we already buffer writes, using
synchronous IO reduces fsync latency by ensuring the individual writes
hit disk. This spreads the latency out across multiple writes.
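For illustration, a minimal sketch of the synchronous-IO idea, assuming the
segment is opened with O_SYNC; the helper name, path, and buffer size are
made up and not taken from the actual WAL writer:

    package wal

    import (
        "bufio"
        "os"
    )

    // writeSynced opens the segment with O_SYNC so each flushed write reaches
    // disk as it happens, spreading the fsync latency across many smaller
    // writes instead of one large pause.
    func writeSynced(path string, entries [][]byte) error {
        f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_SYNC, 0666)
        if err != nil {
            return err
        }
        defer f.Close()

        w := bufio.NewWriterSize(f, 1<<20) // writes are still batched in memory
        for _, e := range entries {
            if _, err := w.Write(e); err != nil {
                return err
            }
        }
        return w.Flush() // the flush hits disk synchronously; no separate Fsync call
    }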
This commit adds a basic TSI versioning scheme, by adding a Version field
to an index's MANIFEST file.
Existing TSI indexes will not have this field present in their MANIFEST
files, and thus will be deemed incompatible with the current version.
Users with existing TSI indexes will be able to remove them, and convert the
resulting inmem indexes to the current version of a TSI index using the
influx_inspect tooling.
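A hedged sketch of what such a version check could look like; the Manifest
struct, field names, and currentVersion constant here are illustrative, not
the exact tsi1 definitions:

    package tsi

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    const currentVersion = 1

    type Manifest struct {
        Version int      `json:"version,omitempty"`
        Files   []string `json:"files,omitempty"`
    }

    func readManifest(path string) (*Manifest, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            return nil, err
        }
        var m Manifest
        if err := json.Unmarshal(data, &m); err != nil {
            return nil, err
        }
        // Older indexes have no Version field; it decodes as zero and is
        // treated as incompatible with the current version.
        if m.Version != currentVersion {
            return nil, fmt.Errorf("incompatible index MANIFEST version: %d", m.Version)
        }
        return &m, nil
    }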
With higher cardinalities, the encoder pools can become a bottleneck.
This changes the snapshot compactions to check out one encoder of each
type and re-use it while writing the snapshots, as opposed to repeatedly
checking it out and in.
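A simplified sketch of the reuse pattern, using a placeholder float encoder
and sync.Pool rather than the real TSM encoder pools:

    package snapshot

    import "sync"

    // floatEncoder is a minimal stand-in for the per-type TSM encoders.
    type floatEncoder struct{ buf []float64 }

    func (e *floatEncoder) Reset()          { e.buf = e.buf[:0] }
    func (e *floatEncoder) Write(v float64) { e.buf = append(e.buf, v) }

    var floatEncoderPool = sync.Pool{New: func() interface{} { return &floatEncoder{} }}

    // writeSnapshot checks the encoder out of the pool once and reuses it for
    // every series, rather than doing a Get/Put per key.
    func writeSnapshot(values map[string][]float64) {
        enc := floatEncoderPool.Get().(*floatEncoder)
        defer floatEncoderPool.Put(enc)

        for _, vals := range values {
            enc.Reset()
            for _, v := range vals {
                enc.Write(v)
            }
            // ... encode the block and append it to the snapshot TSM file ...
        }
    }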
This periodically re-allocates the cache store to avoid memory
fragmentation and gradual slowdown of the store after repeated
deletes and inserts into the map.
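Conceptually the re-allocation amounts to copying the live entries into a
fresh map, roughly like the following sketch (the entry type is a
placeholder):

    package cache

    // entry stands in for the per-series value container held by the store.
    type entry struct{ values []float64 }

    // realloc copies the live entries into a freshly allocated map so the
    // runtime can reclaim the space left behind by repeated deletes, avoiding
    // the gradual slowdown of lookups on a heavily churned map.
    func realloc(old map[string]*entry) map[string]*entry {
        fresh := make(map[string]*entry, len(old))
        for k, v := range old {
            fresh[k] = v
        }
        return fresh
    }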
This instructs the kernel that it can release memory used by mmap'd
TSM files when they are not actively being used. If the mappings are
used again, the kernel will fault the pages back in. On Linux, this causes
RES memory to drop immediately when run.
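On Linux this is presumably done via madvise; a minimal sketch, assuming
MADV_DONTNEED as the advice value and golang.org/x/sys/unix as the syscall
wrapper:

    package mmap

    import "golang.org/x/sys/unix"

    // release advises the kernel that the mmap'd TSM pages can be reclaimed;
    // the next access simply faults them back in from the file.
    func release(b []byte) error {
        return unix.Madvise(b, unix.MADV_DONTNEED)
    }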
A cold shard that suddenly receives a lot of writes could get a very
big cache that takes a long time to snapshot or causes the cache
max memory limit to be hit more quickly. This re-enables the compactions
if necessary during writes so we don't have to wait for the shard monitor
goroutine to re-enable them.
Compactions would create their own TSMReaders for simplicity. With
very high cardinality compactions, creating the reader and indirectIndex
can start to use a significant amount of memory.
This changes the compactions to use a reader that is already allocated
and managed by the FileStore.
These are already sorted during compaction, so switch to sorting lazily
to avoid the CPU cost and allocations. The sort now only occurs when
using the writer directly.
The directIndex used by the TSMWriter maintained a map of series keys
to index entries. When the index is written to the TSM file, the keys
are sorted and then written out in order.
The reason for this is that the directIndex used to be the only index
and was optimized more for reading. Reading has since been taken over
by the indirectIndex, so the map of keys ends up wasting space.
During compactions, the series keys (and index entries) are already sorted
so this change uses the sorting to avoid the map and sort when writing the
index. This reduces allocations and CPU usage quite a bit for larger cardinality
TSM files.
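A simplified sketch of the idea, with a cut-down directIndex that appends
pre-sorted keys instead of holding them in a map; the types here are
illustrative, not the actual tsm1 definitions:

    package tsm1

    import "bytes"

    type indexEntry struct {
        key    []byte
        offset int64
        size   uint32
    }

    // directIndex relies on compactions adding keys in ascending order, so
    // entries can be appended to a slice and written out sequentially instead
    // of being stored in a map that must be sorted at write time.
    type directIndex struct {
        entries []indexEntry
    }

    func (d *directIndex) Add(key []byte, offset int64, size uint32) {
        if n := len(d.entries); n > 0 && bytes.Compare(key, d.entries[n-1].key) < 0 {
            panic("tsm1: index keys must be added in sorted order")
        }
        d.entries = append(d.entries, indexEntry{key: key, offset: offset, size: size})
    }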
This leaves the slower compactions that create full blocks to only
the full compaction. This helps reduce CPU usage and memory while shards
are hot, but increases disk usage (reduced compression) slightly.
Deleting high cardinality series could take a very long time, cause
write timeouts, and even deadlock the process. This fixes these
issues by changing the approach for cleaning up the indexes and
reducing lock contention.
The prior approach deleted each series and updated every index (inmem)
during the delete. This was very slow and caused the index to be locked
while items in a slice were removed one by one. This has been changed
to mark series as deleted and then rebuild the index asynchronously, which
speeds up the process.
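A rough sketch of the mark-and-rebuild approach, using a heavily simplified
stand-in for the inmem index rather than the real types:

    package inmem

    import "sync"

    // index marks series as deleted under the lock, then compacts the
    // tombstones away in an asynchronous rebuild instead of doing per-element
    // slice removals during the delete.
    type index struct {
        mu     sync.RWMutex
        series map[string]bool // true means tombstoned
    }

    func (idx *index) DeleteSeries(keys []string) {
        idx.mu.Lock()
        for _, k := range keys {
            if _, ok := idx.series[k]; ok {
                idx.series[k] = true // cheap mark; no slice surgery under the lock
            }
        }
        idx.mu.Unlock()

        go idx.rebuild() // rebuild off the hot path
    }

    func (idx *index) rebuild() {
        idx.mu.Lock()
        defer idx.mu.Unlock()
        fresh := make(map[string]bool, len(idx.series))
        for k, deleted := range idx.series {
            if !deleted {
                fresh[k] = false
            }
        }
        idx.series = fresh
    }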
There was also a deadlock that could occur when deleting the field set.
Deleting the field set held a write lock, and the function it invoked under
the lock could try to take a read lock on the same field set, which would
then deadlock. This approach was also very slow and caused write timeouts.
It now uses a faster approach that checks for the existence of the
measurement in the cache and file store, which does not take write locks.
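The underlying pattern is easy to reproduce with a plain sync.RWMutex; the
following sketch uses hypothetical names and is not the actual field set
code:

    package fields

    import "sync"

    type fieldSet struct {
        mu     sync.RWMutex
        fields map[string]struct{}
    }

    func (fs *fieldSet) has(name string) bool {
        fs.mu.RLock() // never succeeds if the caller already holds mu.Lock
        defer fs.mu.RUnlock()
        _, ok := fs.fields[name]
        return ok
    }

    func (fs *fieldSet) deleteBroken(name string) {
        fs.mu.Lock()
        defer fs.mu.Unlock()
        if fs.has(name) { // deadlock: RLock blocks while Lock is held
            delete(fs.fields, name)
        }
    }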
It prints the statistics of each iterator that will access the storage
engine. For each access of the storage engine, it will print the number
of shards that will potentially be accessed, the number of files that
may be accessed, the number of series that will be created, the number
of blocks, and the size of those blocks.
This is used quite a bit to determine which fields are needed in a
condition. When the condition gets large, the memory usage begins to
slow it down considerably, and it does not remove duplicates.
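For illustration, deduplicating during the walk avoids both the duplicates
and the extra allocations; the expr type below is just a placeholder for the
query AST:

    package cond

    type expr struct {
        name     string
        children []*expr
    }

    // fieldNames walks the condition once and records each field a single
    // time, rather than collecting duplicates that must be filtered out later.
    func fieldNames(root *expr) []string {
        seen := make(map[string]struct{})
        var names []string
        var walk func(*expr)
        walk = func(n *expr) {
            if n == nil {
                return
            }
            if n.name != "" {
                if _, ok := seen[n.name]; !ok {
                    seen[n.name] = struct{}{}
                    names = append(names, n.name)
                }
            }
            for _, c := range n.children {
                walk(c)
            }
        }
        walk(root)
        return names
    }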
There are several places in the code where comma-ok map retrieval was
being used poorly. Some were benign, like checking existence before
issuing an unconditional delete with no cleanup. Others were potentially
far more serious: assuming that if 'ok' was true, then the resulting
pointer retrieved from the map would be non-nil. `nil` is a perfectly
valid value to store in a map of pointers, and the comma-ok syntax is
meant for when membership is distinct from having a non-zero value.
There were only one or two cases that I saw being used correctly for
maps of pointers.
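A small self-contained example of the pitfall:

    package main

    import "fmt"

    type node struct{ name string }

    func main() {
        m := map[string]*node{
            "a": {name: "a"},
            "b": nil, // nil is a valid value in a map of pointers
        }

        if n, ok := m["b"]; ok {
            // ok only reports membership; n can still be nil here, so
            // dereferencing it without a check would panic.
            fmt.Println(n == nil) // prints true
        }
    }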