influxdb

Commit Graph

Author	SHA1	Message	Date
Stuart Carnie	a0848eac8c	remove unnecessary err value readKey never sets error, so it is always nil	2017-10-12 08:28:53 -07:00
Jason Wilder	1401950b10	Only schedule one compaction per shard at a time The scheduling logic ended up favoring more backlogged shards too much and would starve active, less backed up shards. This occurred because the scheduling kicks in once a second. When it runs, it schedules as many compactions as it can. A backed up shard would end up having more compactions to run during the loop and would generally get to schedule them more frequently. This now allows each shard to try and schedule one compaction at a time which provides a more balanced approach. At some point, we'll probably want to more directly balance each shard's backlog vs letting it happen somewhat randomly.	2017-10-09 11:40:32 -06:00
Jason Wilder	00a403f60e	Reduce allocation in tsmKeyIterator.Next This reuses some intermediate buffers and structs while compacting files.	2017-10-04 17:35:56 -06:00
Jason Wilder	6b6ccf1a40	Wait for compaction goroutines to finish	2017-10-04 10:01:44 -06:00
Jason Wilder	06226d6fd3	Handle orphan lower level TSM files during full planning Some files seem to get orphaned behind higher levels. This causes the compactions to get blocked as the lower level files will not get picked up by their lower level planners. This allows the full plan to identify them and pull them into their plans.	2017-10-04 08:13:14 -06:00
Jason Wilder	a1d0b52897	Allow lower priority compactions to use excess capacity If there is a backlog of level 3 and 4 compactions, but few level 1 and 2 compactions, allow them to use some excess capacity.	2017-10-04 08:11:44 -06:00
Jason Wilder	f2a681c4cf	Unconditionally remove file when calling Remove	2017-10-03 10:49:17 -06:00
Jason Wilder	0c0505881f	Remove multiple file skipping for full compaction planning This check doesn't make sense for high cardinality data as the files typically get big and sparse very quickly. This causes a lot of extra disk space to be used which is taken up by large indexes and sparse data.	2017-10-03 10:48:14 -06:00
Jason Wilder	90df803802	Prevent infinite scheduling loop One shard might be able to run a compaction, but could fail due to limits being hit. This loop would continue indefinitely as the same task would continue to be rescheduled.	2017-10-03 10:48:14 -06:00
Jason Wilder	4ff4ba0841	Use first file in generation for level With higher cardinality or larger series keys, the files can roll over early which causes them to take longer to be compacted by higher levels. This causes larger disk usage and higher numbers of tsm files at times.	2017-10-03 10:48:14 -06:00
Jason Wilder	71071ed67a	Add compaction backlog stat This gives an indication as to whether compactions are backed up or not.	2017-10-03 10:48:14 -06:00
Jason Wilder	16ece490ef	Reduce allocation in tsmKeyIterator.Next The chunked slice is unnecessary and we can re-use k.blocks throughout the compaction.	2017-10-03 10:48:14 -06:00
Jason Wilder	2c5006fccc	Rework snapshotting concurrency This switches the thresholds that are used for writing snapshots concurrently. This scales better than the prior model.	2017-10-03 10:48:14 -06:00
Jason Wilder	3af9c7df37	Remove a defer allocation Shows up under high cardinality compactions.	2017-10-03 10:48:14 -06:00
Jason Wilder	70817350b7	Ensure temp index files are cleaned up on error	2017-10-03 10:48:14 -06:00
Jason Wilder	a5afaf7499	Fix cache mem size not including key size	2017-10-03 10:48:14 -06:00
Jason Wilder	ae821f4e2d	Rework compaction scheduling This changes the compaction scheduling to better utilize the available cores that are free. Previously, a level was planned in its own goroutine and would kick off a number of compaction groups. The problem with this model was that if there were 4 groups, and 3 completed quickly, the planning would be blocked for that level until the last group finished. If the compactions at the prior level are running more quickly, a large backlog could accumulate. This now moves the planning to a single goroutine that plans each level in succession and starts as many groups as it can. When one group finishes, the planning will start the next group for the level.	2017-10-03 10:48:13 -06:00
Jason Wilder	f668b0cc3f	Only use O_SYNC for tsm file writing Doing this for the WAL reduces throughput quite a bit.	2017-10-03 10:48:13 -06:00
Jason Wilder	1610ae5727	Don't return tsm files part of a compaction plan	2017-10-03 10:48:13 -06:00
Joe LeGasse	1525069213	Merge pull request #8892 from influxdata/jl-tag-values auth: add series auth to 'show tag values'	2017-10-03 08:47:39 -04:00
Lyon Hill	7e5fd14e8a	add in some optimization	2017-10-02 12:02:38 -06:00
Lyon Hill	a6cbce0d3e	fix issues brought up by joe	2017-10-02 11:41:03 -06:00
Lyon Hill	38dc837910	Fix a minor memory leak when batching points for some services. fixes #8895	2017-10-02 11:26:25 -06:00
Joe LeGasse	1443b22379	auth: add series auth to 'show tag values'	2017-09-27 20:01:18 -04:00
Edd Robinson	e0cba4477c	Merge pull request #8885 from influxdata/er-entry-race Fix race on Cache entry	2017-09-27 18:42:45 +01:00
Edd Robinson	d0b81c1e6c	Fix race on Cache entry	2017-09-27 18:10:23 +01:00
Edd Robinson	a1b67160f6	Use math/bits in encoder	2017-09-26 12:51:08 +01:00
Jason Wilder	7fed382dbf	Merge pull request #8872 from influxdata/jw-mmap Fix long process stalls	2017-09-25 14:49:35 -06:00
Jason Wilder	122a74c692	Use synchronous IO for wal and tsm writing The fsyncs due to large writes when writing to TSM files and the WAL can eventually cause large pauses. Since we already buffer writes, using synchronous IO reduces fsync latency by ensuring the individual writes hit disk. This spreads out the latency across multiple writes better.	2017-09-25 12:44:57 -06:00
Edd Robinson	2def219f09	Refactor Shard to further protect Engine	2017-09-25 17:43:30 +01:00
Edd Robinson	4a67f92acc	Prevent store from directly accessing Shard's engine	2017-09-25 17:43:01 +01:00
Edd Robinson	8e9cabbb9c	Fix race in TagValues when reaching into engine	2017-09-25 17:43:01 +01:00
Edd Robinson	7739ff749a	Ensure engine protected by shard mutex	2017-09-25 17:42:30 +01:00
Jason Wilder	5774b44a4c	Remove MADV_RANDOM This was inadvertently added when merging the solaris and unix mmap files. This causes large delays due to major page faults.	2017-09-25 10:25:06 -06:00
Edd Robinson	ea104596f0	Implement TSI index versioning This commit adds a basic TSI versioning scheme, by adding a Version field to an index's MANIFEST file. Existing TSI indexes will not have this field present in their MANIFEST files, and thus will be deemed incompatible with the current version. Users with existing TSI indexes will be able to remove them, and convert the resulting inmem indexes to the current version of a TSI index using the influx_inspect tooling.	2017-09-22 17:59:39 +01:00
Jason Wilder	1e345aa7a1	Merge pull request #8856 from influxdata/jw-cache Snapshot compaction improvements	2017-09-22 10:45:54 -06:00
Edd Robinson	44691847e9	Merge branch 'master' into er-8678-tsi1-where	2017-09-22 16:54:49 +01:00
Jason Wilder	94aba64b88	Re-use index entries slice when writing TSM index	2017-09-21 12:48:16 -06:00
Jason Wilder	db204f3eb7	Default concurrent compactions to 50% of available cores	2017-09-21 12:48:11 -06:00
Jason Wilder	deef0c5649	Fix 32bit alignment	2017-09-20 10:00:20 -06:00
Jason Wilder	61ca1243c7	Increase index disk writer buffer	2017-09-20 09:05:30 -06:00
Jason Wilder	796de3dcea	Reduce encoder pool checkout contention With higher cardinalities, the encoder pools were becoming a bottleneck. This changes the snapshot compactions to check out one encoder of each type and re-use it while writing the snapshots as opposed to repeatedly checking it out and in.	2017-09-19 15:27:26 -06:00
Jason Wilder	391a6288c6	Write parallel snapshot for higher cardinalities	2017-09-19 15:27:26 -06:00
Jason Wilder	0d52b060df	Skip onFileStoreReplace with tsi	2017-09-19 15:27:25 -06:00
Jason Wilder	4fe81aeee6	Remove manual Gosched from compactions At higher cardinalities, this dramatically slows down compaction throughput.	2017-09-19 15:27:25 -06:00
Jason Wilder	31e785d676	Don't deduplicate a single value	2017-09-19 15:27:25 -06:00
Jason Wilder	2ca9ccee1f	Reset snapshot cache outside of write lock	2017-09-19 15:27:25 -06:00
Jason Wilder	ddeba2c86b	Split large snapshots and write concurrently	2017-09-19 15:27:25 -06:00
Jason Wilder	9ee305f6f5	Periodically re-allocate cache store This periodically re-allocates the cache store to avoid memory fragmentation and gradual slow down of the store after repeated deletes and inserts into the map.	2017-09-19 15:27:25 -06:00
Jason Wilder	2885b9b310	Remove entrySizeHints map There is a lot of overhead for calculating the hints for larger cardinalities. This slows down resetting the partitions in the ring.	2017-09-19 15:27:25 -06:00
Jason Wilder	4124a8ed97	Simplify cache ring The continuum slice is not needed since the number of partitions doesn't change. This removes the slice to make the mapping simpler.	2017-09-19 15:27:25 -06:00
Stuart Carnie	ed7bc9d825	fix FindValues panic for empty array	2017-09-19 14:23:32 -07:00
Stuart Carnie	92756ec0ad	Reduce allocations, improve readEntries performance by simplifying loop * callers of ReadEntries and Key API can cache allocated slice	2017-09-19 11:57:10 -07:00
Stuart Carnie	baa05de3f8	add benchmarks	2017-09-19 11:47:48 -07:00
Stuart Carnie	cfc6a1cd9f	implement optimization for Include function ``` benchmark old ns/op new ns/op delta BenchmarkIntegerValues_IncludeNone_1000-8 651 6.69 -98.97% BenchmarkIntegerValues_IncludeMiddleHalf_1000-8 1131 114 -89.92% BenchmarkIntegerValues_IncludeFirst_1000-8 638 33.9 -94.69% BenchmarkIntegerValues_IncludeLast_1000-8 1269 32.2 -97.46% BenchmarkIntegerValues_IncludeNone_10000-8 7751 6.76 -99.91% BenchmarkIntegerValues_IncludeMiddleHalf_10000-8 11582 1378 -88.10% BenchmarkIntegerValues_IncludeFirst_10000-8 7911 43.8 -99.45% BenchmarkIntegerValues_IncludeLast_10000-8 12442 38.4 -99.69% ``` (cherry picked from commit fb93ad5)	2017-09-19 09:53:28 -07:00
Stuart Carnie	ca40c1ad3c	<type>Values.Exclude function uses binary search and copy builtin ``` ± benchcmp old.txt new.txt benchmark old ns/op new ns/op delta BenchmarkIntegerValues_ExcludeNone_1000-8 1285 7.34 -99.43% BenchmarkIntegerValues_ExcludeMiddleHalf_1000-8 1258 148 -88.24% BenchmarkIntegerValues_ExcludeFirst_1000-8 1268 7.51 -99.41% BenchmarkIntegerValues_ExcludeLast_1000-8 1125 27.7 -97.54% BenchmarkIntegerValues_ExcludeNone_10000-8 12665 7.31 -99.94% BenchmarkIntegerValues_ExcludeMiddleHalf_10000-8 12039 976 -91.89% BenchmarkIntegerValues_ExcludeFirst_10000-8 12663 7.29 -99.94% BenchmarkIntegerValues_ExcludeLast_10000-8 10990 34.9 -99.68% ``` (cherry picked from commit d7a3c23)	2017-09-19 09:53:26 -07:00
Jason Wilder	940da04a34	Merge pull request #8829 from influxdata/jw-mmap Release mmap pages when shard is cold	2017-09-18 12:08:37 -06:00
Jason Wilder	31646aae3a	Release mmap pages when shard is cold This instructs the kernel that it can release memory used by mmap'd TSM files when they are not actively being used. If the mappings are used, the kernel will fault the pages back in. On linux, this causes RES memory to drop immediately when run.	2017-09-18 11:51:51 -06:00
Edd Robinson	e39de3e427	Merge pull request #8782 from oiooj/pr-shard-fix Correctly check if the Shard is ready for queries or writes	2017-09-18 18:17:19 +01:00
Jonathan A. Sternberg	2228b91b0d	Unsigned data type parsing and prioritization	2017-09-14 12:28:13 -05:00
Jason Wilder	7d467c2047	Fix windows unmapping of anonymous index slice	2017-09-12 10:30:10 -06:00
Jason Wilder	b4b3c159cc	Fixup rebase	2017-09-11 17:04:10 -06:00
Jason Wilder	d5d9f9acfe	Remove debug line	2017-09-11 15:31:28 -06:00
Jason Wilder	26f92ce6ac	Remove commented out code	2017-09-11 15:30:05 -06:00
Jason Wilder	820856347c	Don't use disk temp file for snapshots	2017-09-11 15:29:26 -06:00
Jason Wilder	4ed9c75896	Fix unmapping anonymous memory slice	2017-09-11 15:29:26 -06:00
Jason Wilder	97f7857715	Remove mutex on TSMWriter This isn't used by more than one goroutine so locks are unnecessary.	2017-09-11 15:29:26 -06:00
Jason Wilder	a93a5e9bdf	Include the size of the key in the cache size	2017-09-11 15:29:26 -06:00
Jason Wilder	38460ec37e	Re-enable compactions during writes A cold shard that suddenly receives a lot of writes could get a very big cache that takes a long time to snapshot or causes the cache max memory limit to be hit more quickly. This re-enables the compactions if necessary during writes so we don't have to wait for the shard monitor goroutine to re-enable them.	2017-09-11 15:29:26 -06:00
Ben Johnson	ee4d3c7b3d	Invalidate all bloom filters.	2017-09-11 15:29:26 -06:00
Ben Johnson	3c2487b97a	Clean up tsi bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Ben Johnson	6af936ee61	Fix bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Ben Johnson	a40b2bb210	Simplify bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Edd Robinson	408a78d904	Increase size of SeriesBlock partition	2017-09-11 15:29:26 -06:00
Jason Wilder	7388eb9499	Use disk when writing TSM index	2017-09-11 15:29:25 -06:00
Ben Johnson	0ec2736f23	Incrementally rebuild tsi bloom filters.	2017-09-11 15:29:25 -06:00
Jason Wilder	a5a2957567	Reduce allocation in log_file	2017-09-11 15:29:25 -06:00
Jason Wilder	d3e832b462	Use offheap memory for indirect index offsets slice	2017-09-11 15:29:25 -06:00
Jason Wilder	91eb9de341	Use existing TSMReader from file store during compactions Compactions would create their own TSMReaders for simplicity. With very high cardinality compactions, creating the reader and indirectIndex can start to use a significant amount of memory. This changes the compactions to use a reader that is already allocated and managed by the FileStore.	2017-09-11 15:29:25 -06:00
Jason Wilder	739ecd2ebd	Fix a compaction planning bug There was a race where the plan returned was for files that were just compacted so the compaction would immediately abort.	2017-09-11 15:26:25 -06:00
Jason Wilder	bc4fb0ea10	Sort index entries if necessary These are already sorted during compaction, so switch to sorting lazily to avoid the CPU and allocations. This would only occur when using the writer directly.	2017-09-11 15:26:25 -06:00
Jason Wilder	a9e89ede75	Reduce lock contention on Index Stat and Size are read-only and can take an RLock.	2017-09-11 15:26:25 -06:00
Jason Wilder	f18dec6a4a	Use sorted slice for writing TSM index The directIndex used by the TSMWriter maintained a map of series keys to index entries. When the index is written to the TSM file, the keys are sorted and then written out in order. The reason for this is because directIndex used to be the only index and it was optimized more for reading. The reading has been replaced by the indirectIndex so the map of keys ends up wasting space. During compactions, the series keys (and index entries) are already sorted so this change uses the sorting to avoid the map and sort when writing the index. This reduces allocations and CPU usage quite a bit for larger cardinality TSM files.	2017-09-11 15:26:24 -06:00
Jason Wilder	2a0d7935d7	Switch level 3 compactions to use fast compaction strategy This leaves the slower compactions that create full blocks to only the full compaction. This helps reduce CPU usage and memory while shards are hot, but increases disk usage (reduced compression) slightly.	2017-09-11 15:26:24 -06:00
Jason Wilder	94e229ff59	Merge branch 'master' into jw-drop-series	2017-09-08 15:34:32 -06:00
Jason Wilder	44e1d3f185	Merge pull request #8804 from influxdata/jw-wal-oom Fix increased memory usage in cache and wal	2017-09-08 15:10:53 -06:00
Jason Wilder	78922f9821	Set rc to nil when closing WALSegmentReader	2017-09-08 14:55:02 -06:00
Joe LeGasse	4fb35b373b	auth: apply series auth to TSI	2017-09-08 09:09:53 -04:00
Jason Wilder	b9b648e2a0	Dynamically allocate cache store The cache store can be memory intensive with many shards. This lazily allocates it when needed and frees it when the cache is empty and cold.	2017-09-07 16:35:08 -06:00
Jason Wilder	5581f8b4ae	Re-use WALSegmentReaders at startup	2017-09-07 12:56:17 -06:00
Jason Wilder	e39276b96f	Skip reading 0 byte wal segments	2017-09-07 12:24:54 -06:00
Jason Wilder	a8d9eeef36	Reduce lock contention when deleting high cardinality series Deleting high cardinality series could take a very long time, cause write timeouts as well as deadlock the process. This fixes these issues by changing the approach for cleaning up the indexes and reducing lock contention. The prior approach deleted each series and updated every index (inmem) during the delete. This was very slow and caused the index to be locked while items in a slice were removed one by one. This has been changed to mark series as deleted and then rebuild the index asynchronously which speeds up the process. There was also a deadlock that could occur when deleting the field set. Deleting the field set held a write lock and the function it invoked under the lock could try to take a read lock on the field set. This would then deadlock. This approach was also very slow and caused timeouts for writes. It now uses a faster approach that checks for the existence of the measurement in the cache and filestore which does not take write locks.	2017-09-07 11:36:02 -06:00
Jonathan A. Sternberg	e18425757d	Merge pull request #8791 from influxdata/js-explain-cached-values Include the number of scanned cached values in the iterator cost	2017-09-06 16:00:30 -05:00
Jonathan A. Sternberg	590be193e5	Include the number of scanned cached values in the iterator cost	2017-09-06 15:41:07 -05:00
Stuart Carnie	4a6114028c	exported UnloadIndex checks for ready state	2017-09-05 11:22:13 -07:00
kun	8a283e248c	Correctly check if the Shard is ready for queries or writes	2017-09-03 15:14:58 +08:00
Jonathan A. Sternberg	091ea5f9a5	Merge pull request #8776 from influxdata/js-explain-plan Initial implementation of explain plan	2017-09-01 16:19:37 -05:00
Edd Robinson	51e886ba66	Merge pull request #8757 from oiooj/pr-cl Fix panic when the engine already closed in a shard	2017-09-01 16:59:12 +01:00
Jonathan A. Sternberg	50d404e690	Initial implementation of explain plan It prints the statistics of each iterator that will access the storage engine. For each access of the storage engine, it will print the number of shards that will potentially be accessed, the number of files that may be accessed, the number of series that will be created, the number of blocks, and the size of those blocks.	2017-09-01 09:01:10 -05:00
Jonathan A. Sternberg	466fc9026e	Reduce how long it takes to walk the varrefs in an expression This is used quite a bit to determine which fields are needed in a condition. When the condition gets large, the memory usage begins to slow it down considerably and it doesn't take care of duplicates.	2017-08-31 09:33:45 -05:00
Joe LeGasse	732a0c2eaa	Merge pull request #8769 from influxdata/jl-map-cleanup cleanup: remove poor usage of ',ok' with maps	2017-08-31 09:18:42 -04:00
Ben Johnson	1dbe0662d8	Use system cursors for measurement, series, and tag key meta queries.	2017-08-30 08:35:20 -06:00
Joe LeGasse	a95647b720	cleanup: remove poor usage of ',ok' with maps There are several places in the code where comma-ok map retrieval was being used poorly. Some were benign, like checking existence before issuing an unconditional delete with no cleanup. Others were potentially far more serious: assuming that if 'ok' was true, then the resulting pointer retrieved from the map would be non-nil. `nil` is a perfectly valid value to store in a map of pointers, and the comma-ok syntax is meant for when membership is distinct from having a non-zero value. There were only one or two cases that I saw being used correctly for maps of pointers.	2017-08-30 09:49:31 -04:00
Stuart Carnie	51eb85193c	release lock to avoid dead lock when calling WalkWhereForSeriesIDs * WalkWhereForSeriesIDs may call SeriesIDs, which may attempt to upgrade from a `RLock` to a `Lock`, causing the dead lock	2017-08-29 16:12:51 -07:00
kun	5d5225e77d	Fix panic when engine closed in a shard	2017-08-29 17:22:45 +08:00
Stuart Carnie	0ced270197	fix race condition reading map	2017-08-28 13:36:49 -07:00
Edd Robinson	d011e43a1b	Address feedback	2017-08-23 10:47:01 +01:00
Edd Robinson	a5f4b929c9	Ensure Skip is called in test goroutine	2017-08-23 10:47:01 +01:00
Edd Robinson	9be7c5aaa6	Run relevant engine tests on both indexes	2017-08-23 10:47:01 +01:00
Edd Robinson	9c12607c3e	Ensure shard tests run with both indexes	2017-08-23 10:46:59 +01:00
Edd Robinson	e732cb7a39	Update benchmarks to use sub-benchmarks	2017-08-22 17:51:48 +01:00
Edd Robinson	dd808bb77a	Ensure TSI tests run with TSI index	2017-08-22 17:51:48 +01:00
Edd Robinson	bca4393494	Run most tests for both indexes	2017-08-22 17:51:48 +01:00
Jason Wilder	d305b89f74	Merge pull request #8726 from influxdata/jw-tsm-file-leak Fix leaking tmp file when large compaction aborted	2017-08-22 09:59:23 -05:00
Stuart Carnie	2ef9b489f0	Merge pull request #8727 from influxdata/sgc-finalizer log message when iterator is closed by finalizer	2017-08-22 07:29:38 -07:00
Stuart Carnie	d189621d07	log message when iterator closed by finalizer	2017-08-21 16:46:24 -07:00
Jason Wilder	e265d150be	Fix leaking tmp file when large compaction aborted If a large compaction was running and was aborted, it would leave some tmp files around for files that it had fully written. The current active file was cleaned up, but already completed ones would not. This would occur when a TSM file needed to rollover due to size.	2017-08-21 17:04:57 -06:00
Jonathan A. Sternberg	5ce6007347	Merge pull request #8724 from influxdata/js-remove-unused-cursor This cursor implementation appears to be completely unused	2017-08-21 17:44:51 -05:00
Jonathan A. Sternberg	c0f7a8af5b	This cursor implementation appears to be completely unused Remove it so that its existence doesn't confuse someone that this is actually the cursor. The real cursors appear to be in file_store.gen.go.	2017-08-21 16:27:23 -05:00
Stuart Carnie	25edd7bfdf	naming	2017-08-17 15:47:47 -07:00
Stuart Carnie	c86dc0d103	redundant allocation is overwritten by line 1769	2017-08-17 11:12:41 -07:00
Stuart Carnie	823f903cc6	inputs are closed if Merge returns error and use <type>FinalizerIterator * <type>FinalizerIterator sets a runtime finalizer and calls Close when garbage collected. This will ensure any associated cursors are closed and the associated TSM files released * `query.Iterators#Merge` call could return an error and the inputs would not be closed, causing a cursor leak	2017-08-17 11:12:18 -07:00
Jason Wilder	85842503be	Fix deadlock in engine/measurement fields The OnReplace func ends up trying to acquire locks on MeasurementFields. When it's called via snapshotting, this can deadlock because the snapshotting goroutine also holds an RLock on the engine. If a delete measurement call is run at the right time, it will lock the MeasurementFields and try to acquire a lock on the engine to disable compactions. This creates a deadlock. To fix this, the OnReplace callback is moved to a function param to allow only Replace calls as part of a compaction to invoke it as opposed to both snapshotting and compactions. Fixes #8713	2017-08-16 16:43:40 -06:00
Jonathan A. Sternberg	697759613c	Remove time comparisons from the inner sections of the storage engine	2017-08-16 16:51:13 -05:00
Jonathan A. Sternberg	8bd04ebe39	Remove TimeRange function and replace with a more accurate ConditionExpr function The ConditionExpr function is more accurate because it parses the condition and ensures that time conditions are actually used correctly. That means that attempting to combine conditions with OR will not result in the query silently pretending it's an AND and nested conditions work correctly so there is only one way to read the query. It also extracts the non-time conditions into a separate condition so we can stop attempting to parse around the time conditions in lower layers of the storage engine. This change does not remove those hacks, but a following commit should be able to sanitize the condition and remove them.	2017-08-16 16:45:35 -05:00
Jonathan A. Sternberg	9a2357c2c0	Separate the query engine into a separate package This change provides a clear separation between the query engine mechanics and the query language so that the language can be parsed and dealt with separate from the query engine itself.	2017-08-16 13:38:43 -05:00
Stuart Carnie	3caeee8a24	fix: cursor leak when cur == nil and aux or conds is not empty	2017-08-16 09:17:20 -07:00
Ben Johnson	e0d8cb0ef3	Cardinality AST, parser, & rewriter fixes.	2017-08-16 09:27:29 -06:00
Ben Johnson	60ab1282ea	Refactor system iterators. Previously pseudo iterators could be created for meta data such as series, measurement, and tag data. These iterators were created at a higher level and lacked a lot of the power of the query engine. This commit moves system iterators down to the series level and supports the following: - _name - _seriesKey - _tagKey - _tagValue - _fieldKey These can be used as normal fields such as: SELECT _seriesKey FROM cpu This will return all the series keys for `cpu`.	2017-08-16 09:27:29 -06:00
Ben Johnson	c9b5d60753	Parse SHOW CARDINALITY.	2017-08-16 09:27:15 -06:00
Ben Johnson	c4e2ba25c3	Merge pull request #8669 from benbjohnson/1392-tsi-index-migration TSI Index Migration Tool	2017-08-16 09:16:03 -06:00
David Norton	1d8d739418	fix #8677 : check for snapshot size == 0	2017-08-16 09:43:56 -04:00
Jason Wilder	186e44d227	Merge pull request #8702 from influxdata/jw-monitor-cpu Reduce CPU usage when checking series cardinality	2017-08-15 16:02:17 -06:00
Jason Wilder	c74932de94	Limit shard cardinality checks to 1 per database The tag cardinality checks were run for all inmem shards. Since inmem shards share the same index, a lot of the work is redundant. Inmem shards also need to sort their measurement and tag keys which can be CPU intensive with many shards or higher cardinality. This changes the monitoring to just check one shard in each database which should lower CPU usage due to excessive sorting. The longer term solution is to use TSI which would not have this check or required sorting.	2017-08-15 12:17:18 -06:00
Ben Johnson	06bc3b6fbf	TSI Index Migration	2017-08-15 11:40:24 -06:00
Jason Wilder	90e2cadeb6	Fix drop measurement not dropping all data If there were multiple shards, drop measurement could update the index and remove the measurement before the other shards ran their deletes. This causes the later shards to not see any series to delete. The fix is to allow deleteSeries to handle the index delete which already accounts for removing the measurement when it is fully removed from the index.	2017-08-15 11:19:45 -06:00
Jason Wilder	61b13eb12b	Fix partiallyRead logic The partiallyRead func didn't account for the initial values and would return true for blocks that had not been read at all. This causes a slower path during compactions that forces a block to be decoded when it could just be merged as is without being decoded. This causes compactions to consume more CPU and run slower at times.	2017-08-14 16:44:32 -06:00
Edd Robinson	45969ef3c6	Allow tag filtering when using DELETE with tsi1	2017-08-14 19:09:36 +01:00
Joe LeGasse	1121b69a9e	auth: apply FGA to SHOW SERIES	2017-08-09 14:56:53 -04:00
Edd Robinson	0f648e5170	Remove unsafe shenanigans	2017-08-03 16:38:05 +01:00
Edd Robinson	2d57b599e9	Remove debugging statement	2017-08-02 17:24:00 +01:00
Edd Robinson	da676a79ae	Implement TSI iterator	2017-08-02 16:29:14 +01:00
Edd Robinson	befae864bd	Add tests for merge function	2017-08-02 14:10:52 +01:00
Edd Robinson	aa7095be5a	Use a merge-based approach for TagValues	2017-08-02 14:10:52 +01:00
Jason Wilder	94a48774b7	Pull in new index filter	2017-08-02 14:10:52 +01:00
Edd Robinson	1e9ce8e0a7	Add test for TagValues	2017-08-02 14:10:52 +01:00
Stuart Carnie	5449285c4c	Merge pull request #8652 from influxdata/sgc-literal-cursor Reduce allocations using nil cursors and literal value cursors	2017-08-01 10:20:24 -07:00
Jason Wilder	173276a409	Remove unused filestore reference Reduces cursor struct size from 119 bytes to 111.	2017-08-01 09:41:16 -06:00
Stuart Carnie	ff65f0f24d	Reduce allocations using nil cursors and literal value cursors ``` benchmark old ns/op new ns/op delta BenchmarkIntegerIterator_Next-8 82.8 22.7 -72.58% benchmark old allocs new allocs delta BenchmarkIntegerIterator_Next-8 3 0 -100.00% benchmark old bytes new bytes delta BenchmarkIntegerIterator_Next-8 32 0 -100.00% ```	2017-07-30 09:15:34 -07:00
Jason Wilder	3d12c62121	Avoid repeatedly growing decoded values slices	2017-07-28 11:00:56 -06:00
Jason Wilder	778000435a	Convert all keys from string to []byte in TSM engine This switches all the interfaces that take string series key to take a []byte. This eliminates many small allocations where we convert between the two repeatedly. Eventually, this change should propagate further up the stack.	2017-07-28 11:00:50 -06:00
Jason Wilder	8009da0187	Remove some extra cursor buffers that are not needed	2017-07-28 10:53:07 -06:00
Jason Wilder	6582caa78b	Reduce allocations when creating KeyCursors The refs map was to increment the file references one time each. It doesn't hurt to increment them multiple times though. We also do not need to copy the files slice as we are accessing it under a read lock so it can't be changed.	2017-07-28 10:53:07 -06:00
Jonathan A. Sternberg	f541cff2b2	Merge pull request #8646 from influxdata/js-tsdb-shard-group-tests Improve test cases in the tsdb package for the ShardGroup interface	2017-07-28 10:22:14 -05:00
Jason Wilder	85642e2831	Merge pull request #8630 from influxdata/jw-drop-oom Prevent excessive memory usage when dropping series	2017-07-27 16:24:52 -06:00
Jonathan A. Sternberg	02d9649be4	Improve test cases in the tsdb package for the ShardGroup interface tsdb.Shard and tsdb.Shards both implement tsdb.ShardGroup and neither were tested within the tsdb package itself. This adds tests for those methods which are used by the query engine.	2017-07-27 17:19:22 -05:00
Jason Wilder	6e6cc991ee	Merge pull request #8629 from influxdata/jw-compaction-abort Interrupt in progress TSM compactions	2017-07-27 16:12:40 -06:00
Jason Wilder	c75ac3076f	Limit delete to run one shard at a time There was a change to speed up deleting and dropping measurements that executed the deletes in parallel for all shards at once. #7015 When TSI was merged in #7618, the series keys passed into Shard.DeleteMeasurement were removed and were expanded lower down. This causes memory to blow up when a delete across many shards occurs as we now expand the set of series keys N times instead of just once as before. While running the deletes in parallel would be ideal, there have been a number of optimizations in the delete path that make running deletes serially pretty good. This change just limits the concurrency of the deletes which keeps memory more stable.	2017-07-27 16:01:47 -06:00
Jason Wilder	18a02d50d7	Interrupt in progress TSM compactions When snapshots and compactions are disabled, the check to see if the compaction should be aborted occurs in between writing to the next TSM file. If a large compaction is running, it might take a while for the file to be finished writing causing long delays. This now interrupts compactions while iterating over the blocks to write which allows them to abort immediately.	2017-07-27 15:58:56 -06:00
Stuart Carnie	0c79ec6f17	update xxhash and use Sum64String to avoid allocs ``` ± benchcmp ring_before.txt ring_after.txt benchmark old ns/op new ns/op delta BenchmarkRing_getPartition_100-8 108 48.1 -55.46% BenchmarkRing_getPartition_1000-8 113 48.9 -56.73% benchmark old allocs new allocs delta BenchmarkRing_getPartition_100-8 1 0 -100.00% BenchmarkRing_getPartition_1000-8 1 0 -100.00% benchmark old bytes new bytes delta BenchmarkRing_getPartition_100-8 192 0 -100.00% BenchmarkRing_getPartition_1000-8 192 0 -100.00% ```	2017-07-26 10:16:54 -07:00
Stuart Carnie	d243df5ca3	simplify loop	2017-07-24 09:03:22 -07:00
Stuart Carnie	eec80692c4	Taught tsm1 storage engine how to read and write uint64 values * introduced UnsignedValue type * leveraged existing int64 compression algorithms (RLE, Simple 8B) * tsm and WAL can read and write UnsignedValue * compaction is aware of UnsignedValue * unsigned support to model, cursors and write points NOTE: there is no support to create unsigned points, as the line protocol has not been modified.	2017-07-24 09:03:22 -07:00
Jason Wilder	4244d0e053	Merge pull request #8568 from influxdata/jw-tombstone-compress Compress tombstone files	2017-07-10 11:28:09 -06:00
Jason Wilder	c25f7b8b3f	Fix duplicate points returned after delete The sortedSeriesIds slice was not getting reset to 0 which caused the same series ids to exist in the slice more than once. Since the size of the slice never matched the size of the seriesID map, it kept appendending to the slice and sorting it which cause multiple cursor to get created for the same series. Fixes #8531	2017-07-10 10:37:01 -06:00
Stuart Carnie	2ccdda72a1	Free RLock prior to returning	2017-07-08 07:14:50 -07:00
Jason Wilder	dba3ce1a42	Merge pull request #8576 from influxdata/jw-delete-index Fix index inconsistency after deletes	2017-07-07 14:36:33 -06:00
Jason Wilder	e9370e0b86	Fix indefinite hang in WAL.writeToLog There was a race in the WAL writeToLog and scheduleSync which could lead to a writing goroutine blocking indefinitely on its syncErr channel. The issue was that the clearing of the syncCount happenend after the wal was unlock. If a goroutine was able to lock, write and call scheduleSync before the existing scheduleSync goroutine returns and ran the defer to clear the syncCount, then a new scheduleSync goroutine would not get started. This left the writing goroutine block with nothing to signal it. While in this state, a RLock on the engine was held. If a Lock was requested on the engine during this time, all future writes and queries would block waiting on the blocked wal writer. The fix is to move the atomic clearing of syncCount before the Lock is released.	2017-07-07 13:31:52 -06:00
Jason Wilder	5e11cdcdd7	Fix incorrect condition in OverlapsKeyRange The min key was not used in OverlapsKeyRange which caused it to return false when it should be true. This causes a bug where deletes would not write tombstones for files that actually contained the data it was supposed to delete.	2017-07-07 12:19:33 -06:00
Jason Wilder	839cddf6d5	Refresh index after compactions The in-memory index can get out of sync when deletes and writes to the same measurement are running concurrently. The index is updated independently from data on disk and it's possible for the index to unassign a shard when data still exists on disk. What happens is that there are TSM files on disk, but the index does not know that the series that exist in those files still are in the shard. Restarting the server reloads the index and the data is visible again. From and end user perspective, this can look like more data is deleted than should have been or that deleted data re-appears after a restart or writes to the shard occur again. There isn't an easy way to resolve this since the index and storage are not transactional resources and we cannot atomically commit or rollback changes to both at once. As a workaround, after new TSM files are installed, we refresh the index with series keys that exist in the new tsm files as well as any lingering data still in the cache. There is a small window of time when the index may be missing series, but it will re-appear after the refresh completes.	2017-07-07 12:19:30 -06:00
Jason Wilder	3e7dfad7c4	Compress tombstone files This adds a v3 format that is a gzip compressed version of the v2 format. It reduces the size of tombstone files substantially without having to support a more feature rich file format for tombstones.	2017-07-06 10:10:31 -06:00
Jason Wilder	9ac042b5cd	Reduce lock contention when disabling compactions The monitor goroutine calls enable compactions every 10s to spin down (or start up) goroutines for cold shards. This frequent Lock may be causing lock contention for writes and queries which get blocked trying to acquire an RLock. The go RWMutex says that new RLock calls will block if there is a pending Lock call that is blocked. Switching the common path to use an RLock should avoid the Lock and reduce lock contention for writes and queries.	2017-07-05 15:42:21 -06:00
Edd Robinson	101af89987	Update CHANGELOG	2017-07-05 16:35:41 +01:00
Edd Robinson	0748d28986	Ensure tmp files cleaned up when compaction disabled	2017-07-04 20:04:23 +01:00
Ben Johnson	9e64813db8	Defer unlock all write locks in inmem index. Currently two write locks in `inmem` are obtained and then manually unlocked at function exit points. However, we have reports that the `inmem` index is hanging on a write lock and cannot track the issue down to anything else besides a lock that could have been left unlocked because of a panic. This commit changes the two locks to always defer their unlocks to prevent these hangs.	2017-06-29 10:23:13 -06:00
Ben Johnson	f9dc61928a	Fix TSI issue with spaces in tag values.	2017-06-28 11:39:48 -06:00
Jason Wilder	9bd703d597	Fix possible deadlocks in inmem index	2017-06-21 12:07:40 -06:00
Jason Wilder	77afe50f7e	Fix panic in ForEachMeasurementTagKey If a shard was closed, ForEachMeasurementTagKey and TagKeyCardinality would panic because the engine was nil.	2017-06-13 12:04:32 -06:00
Ben Johnson	b51f604030	Fix TSI non-contiguous compaction panic. This fixes the case where log files are compacted out of order and cause non-contiguous sets of index files to be compacted. Previously, the compaction planner would fetch a list of index files for each level and compact them in order starting with the oldest ones. This can be a problem for level 1 because level 0 (log files) are compacted individually and in some cases a log file can finish compacting before older log files are finished compacting. This causes there to be a gap in the list of level 1 files that is ignored when fetching a list of index files. Now, the planner reads the list of index files starting from the oldest but stops once it hits a log file. This prevents that gap from being ignored.	2017-06-13 10:53:26 -06:00
Summer	d17c205b54	fix typo	2017-06-12 11:20:08 +08:00
marchtea	6e6f92c99a	fix index file fd leak	2017-06-12 10:58:05 +08:00
Ben Johnson	bcc6ef769b	Check file count before attempting a TSI level compaction. This check was previously in a different section of code which was lost during a refactor to the new compaction strategy. The compaction planning now makes a check to ensure at least two files are available for compaction in a level.	2017-06-06 11:08:59 -06:00
Ben Johnson	3128c6a42e	Fix SHOW TAG VALUES deduplication.	2017-06-01 15:38:35 -06:00
Stuart Carnie	47f97ea134	use parsed measurement and models.Tags	2017-05-26 13:21:59 -07:00
Stuart Carnie	3ec9b401f7	fix benchmark test	2017-05-26 13:21:59 -07:00
Stuart Carnie	46796d932f	add database to index, engine and shard; call AuthorizeSeriesRead	2017-05-26 13:21:50 -07:00
Joe LeGasse	815f740f4c	initial fga work wip wip fix tests / build	2017-05-26 13:16:27 -07:00
Stuart Carnie	c30c33dbcb	Merge remote-tracking branch 'origin/master' into sgc-tagsets	2017-05-26 09:10:18 -07:00
Stuart Carnie	c89d98dc02	gofmt	2017-05-25 16:00:23 -07:00
Stuart Carnie	386720b2e7	improvements to inmem/Measurement.TagSets API ``` benchmark old ns/op new ns/op delta BenchmarkMeasurement_TagSetsNoDimensions_1000-8 234054 117315 -49.88% BenchmarkMeasurement_TagSetsDimensions_1000-8 996838 313313 -68.57% BenchmarkMeasurement_TagSetsNoDimensions_100000-8 58940464 39452117 -33.06% BenchmarkMeasurement_TagSetsDimensions_100000-8 175612060 70195562 -60.03% benchmark old allocs new allocs delta BenchmarkMeasurement_TagSetsNoDimensions_1000-8 1026 26 -97.47% BenchmarkMeasurement_TagSetsDimensions_1000-8 8026 2029 -74.72% BenchmarkMeasurement_TagSetsNoDimensions_100000-8 100064 64 -99.94% BenchmarkMeasurement_TagSetsDimensions_100000-8 800064 200067 -74.99% benchmark old bytes new bytes delta BenchmarkMeasurement_TagSetsNoDimensions_1000-8 117080 69080 -41.00% BenchmarkMeasurement_TagSetsDimensions_1000-8 549081 117176 -78.66% BenchmarkMeasurement_TagSetsNoDimensions_100000-8 23298264 18498265 -20.60% BenchmarkMeasurement_TagSetsDimensions_100000-8 66498276 23298360 -64.96% ```	2017-05-25 15:52:27 -07:00
Jason Wilder	14b54e08cb	Fix compile error	2017-05-25 15:18:35 -06:00
Jason Wilder	6b594351e9	Merge pull request #8425 from influxdata/jw-max-key Fix large field keys preventing snapshot compactions	2017-05-25 12:19:59 -06:00
Ben Johnson	24446a0297	Implement zap logging in TSI.	2017-05-25 08:57:50 -06:00
Jason Wilder	208ef09f87	Prevent writing series keys that exceed max key size WriteBlock was missing the check for the max series keys which allowed series keys to be written that were larger than the 2 bytes allocated to store their length. When this occurred, the TSM can fail to load.	2017-05-24 13:41:09 -06:00
Jason Wilder	2c91eab241	Merge pull request #8420 from influxdata/jw-snap-err Compaction planning fixes	2017-05-23 13:59:48 -06:00
Ben Johnson	547db32d01	Fix tsi go vet issues.	2017-05-23 13:42:38 -06:00
Jason Wilder	29e4287fd2	Preven masking root errors when compactions are in progress The root error when creating a tmp file when writing a snapshot was hidden making it difficult to determine why snapshots were failing.	2017-05-23 12:09:36 -06:00
Jason Wilder	bd6d0681e9	Ensure planned files are released The defer was never executed because the planning happens in a long running goroutine that loops. The plans need to be released immediately after applying them.	2017-05-23 12:08:25 -06:00
Jason Wilder	4e582f297a	Fix race in findGenerations It was possible that the findGenerations could get stuck returning no files even when generations existed on disk.	2017-05-23 12:05:47 -06:00
Ben Johnson	3023052f58	Merge pull request #8290 from benbjohnson/tsi-tag-block-delta-encode TSI Compaction Refactor	2017-05-23 10:25:16 -06:00
Ben Johnson	48456d80ad	Remove tsi commented code.	2017-05-23 10:24:37 -06:00
Jason Wilder	5619946b85	Merge pull request #8416 from influxdata/jw-tsm-tmp Fix TSM tmp file lingering on disk	2017-05-23 10:12:18 -06:00
Ben Johnson	2524df3405	Convert tsi1 series keys to uint32.	2017-05-23 09:48:13 -06:00
Ben Johnson	c744e2f562	TSI pull request fixes.	2017-05-23 09:01:05 -06:00
Ben Johnson	57eeae03fc	Add note about SeriesIDs() limitation.	2017-05-23 08:42:25 -06:00
Ben Johnson	e7f39c06ab	Refactor TSI1 compaction.	2017-05-23 08:42:25 -06:00
Ben Johnson	1975940f76	intermediate compaction commit	2017-05-23 08:42:25 -06:00
Ben Johnson	79edc0979c	Add temporary debugging stats for offset lookups.	2017-05-23 08:41:31 -06:00
Ben Johnson	48a06432df	Add tsi1 bloom filter.	2017-05-23 08:41:31 -06:00
Ben Johnson	f3e08c5871	Delta encode tag and measurement block series data.	2017-05-23 08:41:31 -06:00
Ben Johnson	6f58149052	Increase tsi compaction factor.	2017-05-23 08:40:26 -06:00
Jason Wilder	1833475c09	Fix TSM tmp files leaking TMP files could leak when compactions failed for various reasons. They were also being deleted inadvertently when compactions were disabled causing other errors to be reported in the logs.	2017-05-22 14:51:18 -06:00
Stuart Carnie	5c5bea2baa	move Measurement and Series to inmem package	2017-05-19 08:17:09 -07:00
Jason Wilder	9445ccbad3	Expose shard meta info on Shard	2017-05-16 11:18:02 -06:00
Stuart Carnie	c863923e68	cache MarshalSize	2017-05-12 14:05:25 -06:00
Stuart Carnie	0151afe31c	check size and allocate once	2017-05-12 14:05:25 -06:00
Stuart Carnie	096d6f65b4	explicit sizes	2017-05-12 14:05:24 -06:00
Jason Wilder	4d002bb370	Limit concurrent compactions within a shard This changes full compactions within a shard to run sequentially instead of running all the compaction groups in parallel. Normally, there is only 1 full compaction group to run. At times, there could be several which causes instability if they are all running concurrently as they tie up a cpu for long periods of time. Level compactions are also capped to a max of 4 concurrently running for each level in a shard. This prevents sudden spikes in CPU and disk usage due to a large backlog of tsm files at a given level.	2017-05-12 14:05:24 -06:00
Jason Wilder	2cac46ebbc	Convert usage of strings to []byte Measurement name and field were converted between []byte and string repetively causing lots of garbage. This switches the code to use []byte in the write path.	2017-05-12 14:05:19 -06:00
Jason Wilder	503d41a08f	Add LimitedBytePool for wal buffers This pool was previously a pool.Bytes to avoid repetitive allocations. It was recently switchted to a sync.Pool because pool.Bytes held onto very larger buffers at times which were never released. sync.Pool is showing up in allocation profiles quite frequently. This switches the pool to a new pool that limits how many buffers are in the pool as well as the max size of each buffer in the pool. This provides better bounds on allocations.	2017-05-11 11:27:00 -06:00
Jason Wilder	e17be9f4ba	Merge pull request #8377 from influxdata/jw-encoders Speed up time encoding/decoding	2017-05-11 10:38:27 -06:00
Joe LeGasse	087d9f4670	tsm: fixed test to not require sorted backup tarball	2017-05-11 12:00:19 -04:00
Jason Wilder	b150a6293c	Merge pull request #8380 from influxdata/jw-wal-buffer Use buffer writer for wal segments	2017-05-11 08:34:44 -06:00
Jason Wilder	b81ac21bcb	Merge pull request #8378 from influxdata/jw-snapshot-disable Don't disable snapshots when snapshot compactions are disabled	2017-05-10 12:00:27 -06:00
Jason Wilder	e102fcca9c	Use buffer writer for wal segments	2017-05-10 11:42:32 -06:00
Jason Wilder	39a829c1ae	Speed up time encoding/decoding This speeds up time encoding and decoding by skipping the divisor scaling if scaling by 1. Since division and multiplication are expensive cpu and scaling by 1 has no effect, this just slows encoding and decoding down.	2017-05-10 11:12:35 -06:00
Jason Wilder	4e3e707abc	Fix packed time encoded benchmark	2017-05-10 10:35:44 -06:00
Jason Wilder	e6f31c38b5	Merge pull request #8372 from influxdata/jw-tombstone-range Fix deletes triggering unnecessary compactions	2017-05-08 16:52:59 -06:00
Jason Wilder	29c2b1958e	Fix deletes triggering unnecessary compactions Tombstone files would be written to all TSM files even if the deleted keys or timerange did not exist in the TSM file. This had the side effect of causing shards to get recompacted back to the same state. If any shards or large numbers of TSM files existed, disk usage and CPU utilization would spike causing issues. This prevents tombstones being written for TSM files that could not possiby contain the series keys being deleted or if the delted time range is outside the range of the file.	2017-05-08 14:52:28 -06:00
Jason Wilder	9374c4f513	Reduce allocations when monitoring shards When monitoring shards, a slice of measurements is allocated for each shard. With many shards and measurements, these allocations can be large. Since inmem shards share the same index, we only need to do this once since the resulting slices are all the same. This reduces memory usage when monitoring shard cardinality.	2017-05-08 13:34:40 -06:00
Jason Wilder	00bdf62b83	Make shard is ready before returning index type Shard can be created before they are opened and not have an index setup yet. This can cause a panic if IndexType is called.	2017-05-08 12:48:35 -06:00
Jason Wilder	041262af0e	Fix race in shard engine was accessed outside of an RLock which can cause a race when montitoring goroutines access the shard while it's closed/closing.	2017-05-08 12:37:18 -06:00
Ben Johnson	489c89bea4	Add tsi support tooling.	2017-05-08 11:00:15 -06:00
Jason Wilder	c0c6ad6880	Don't disable snapshots when snapshot compactions are disabled Snapshot compactions can be disabled independently of snapshotting capability. This prevents taking backups of shards that have compactions disabled.	2017-05-05 14:15:45 -06:00
Jason Wilder	73ddd4787b	Fix race in SeriesN and CreateSeriesIfNotExists	2017-05-04 14:40:50 -06:00
Jason Wilder	fc34d30038	Uses SeriesN instead of copying sketches Avoids some extra allocations.	2017-05-04 10:12:38 -06:00
Jason Wilder	bc639c5982	Make disableLevelCompactions lighter weight Since this is called more frequently now, the cleanup func was invoked quite a bit which makes several syscalls per shard. This should only be called the first time compactions are disabled.	2017-05-04 09:56:15 -06:00
Jason Wilder	7371f1067b	Fix deadlock in Index.ForEachMeasurementTagKey Index.ForEachMeasurementTagKey held an RLock while call the fn, if the fn made another call into the index which acquired an RLock and after another goroutine tried to acquire a Lock, it would deadlock.	2017-05-03 22:48:10 -06:00
Jason Wilder	b4ea523910	Include snapshot size in the total cache size This was causing a shard to appear idle when in fact a snapshot compaction was running. If the time was write, the compactions would be disabled and the snapshot compaction would be aborted.	2017-05-03 16:31:58 -06:00
Jason Wilder	88848a9426	Remove per shard monitor goroutine The monitor goroutine ran for each shard and updated disk stats as well as logged cardinality warnings. This goroutine has been removed by making the disks stats more lightweight and callable direclty from Statisics and move the logging to the tsdb.Store. The latter allows one goroutine to handle all shards.	2017-05-03 16:31:57 -06:00
Jason Wilder	f87fd7c7ed	Stop background compaction goroutines when shard is cold Each shard has a number of goroutines for compacting different levels of TSM files. When a shard goes cold and is fully compacted, these goroutines are still running. This change will stop background shard goroutines when the shard goes cold and start them back up if new writes arrive.	2017-05-03 16:31:57 -06:00
Jason Wilder	3d1c0cd981	Don't return compaction plans for files already part of a plan The compactor prevents the same file from being compacted by different compaction runs, but it can result in warning errors in the logs that are confusing. This adds compaction plan tracking to the planner so that files are only part of one plan at a given time.	2017-05-03 16:31:57 -06:00
Jason Wilder	8fc9853ed8	Add max-concurrent-compactions limit This limit allows the number of concurrent level and full compactions to be throttled. Snapshot compactions are not affected by this limit as then need to run continously. This limit can be used to control how much CPU is consumed by compactions. The default is to limit to the number of CPU available.	2017-05-03 16:31:57 -06:00
Jason Wilder	80fef4af4a	Enable shards after loading Compactions are enabled as soon as the shard is opened. This can slow down startup or cause the system to spike in CPU usage at startup if many shards need to be compacted. This now delays compactions until after they are loaded.	2017-05-03 16:31:57 -06:00
Jason Wilder	02e22f4a00	Fix deadlock in Measurement The lazy sorting of series caused a deadlock since it can not take a Lock when a caller may have already acquired an RLock. filters should be called w/o any locks as the function already acquires locks as needed.	2017-05-03 13:49:56 -06:00
Jason Wilder	3c130cd39c	Expose TSMWriter.Flush Allows flushing the writer so we don't always need to close and re-open the file handle.	2017-04-28 14:00:50 -06:00
Jason Wilder	141f0d71cd	Update index when import files	2017-04-28 14:00:45 -06:00
Jason Wilder	a76146e34a	Add Store.Import capability This allows the contents of a backup to be imported into a shard without requiring the whole shard to be replaced.	2017-04-28 13:30:46 -06:00
Jason Wilder	3839fe34ea	Remove FileStore.Add/Remove Can use Replace which handles files in-use and stats correctly.	2017-04-28 13:20:55 -06:00
Jason Wilder	137d0c0d09	Rename WAL.WritePoints to WAL.WriteMulti To match Cache.WriteMulti	2017-04-28 13:20:55 -06:00
Jason Wilder	28422f2fec	Use consistent receiver var name for Value types	2017-04-28 13:20:55 -06:00
Jason Wilder	1bc4936336	Export Reader.ReadBytes	2017-04-28 13:20:55 -06:00
Stuart Carnie	b2d2976466	update reason messages	2017-04-28 11:21:57 -07:00
Stuart Carnie	8097e817f6	prefix partial write errors with `partial write:` NOTE: parser errors (via http API) are also transformed into PartialWriteError	2017-04-28 11:00:14 -07:00
Ben Johnson	aa64c908d0	Merge pull request #8314 from benbjohnson/tsi-doc Add TSI documentation	2017-04-24 10:58:31 -06:00
Ben Johnson	ba7108f94e	Add TSI documentation.	2017-04-21 14:45:03 -06:00
Jason Wilder	d88604f6f2	Move repetive loop checks outside of values loop	2017-04-20 13:45:04 -06:00
Jason Wilder	888689f5d3	Move values loop under type switch All the values read must be of the same type so repeatedly using the type switch is confusing and less efficiient.	2017-04-20 13:39:49 -06:00
Jason Wilder	b0988511bf	Use fixed size array instead of slice	2017-04-20 13:38:33 -06:00
Jason Wilder	da6bdfdda8	Use bufio.Reader when reading wal segments Reduces disk IO due to small reads.	2017-04-20 13:33:42 -06:00
Jason Wilder	8e9cbd7ffc	Simplify WALSegmentReader.UnmarshalBinary There were two loops over nvals which created some extra allocation which coudl be replaced with a simplet slice capacity and append.	2017-04-20 13:33:42 -06:00
Jason Wilder	02b663b651	Fix lock contention in Index.CreateSeriesListIfNotExists There was contention on the write lock which only needs to be acquired when checking to see if the log file should be rolled over.	2017-04-20 12:28:42 -06:00
Jason Wilder	40ec85aacd	Fix lock contention in LogFile.SeriesWithBuffer Under high write load, the check for each series was done sequentially which caused a lot of CPU time to acquire/release the RLock on LogFile. This switches the code to check multiple series at once under an RLock similar to the chang for inmem.	2017-04-20 12:28:42 -06:00
Jason Wilder	0e715b5b74	Reduce lock contention on MeasurementFields	2017-04-20 12:28:42 -06:00
Jason Wilder	ef65ee77f4	Switch WAL byte pools to sync/pool The current bytes.Pool will hold onto byte slices indefinitely. Large writes can cause the pool to hold onto very large buffers over time. Testing w/ sync/pool seems to perform similarly now so using a sync/pool will allow these buffers to be GC'd when necessary.	2017-04-20 12:28:42 -06:00
Jason Wilder	d155d37ca8	Reduce TSM write buffer When many TSM files are being compacted, the buffers can add up fairly quickly.	2017-04-20 12:28:42 -06:00
Jason Wilder	3c2825a851	Reduce lock thrashing when checking series The inmem index would call CreateSeriesIfNotExist for each series which takes and releases and RLock to see if a series exists. Under high write load, the lock shows up in profiles quite a bit. This adds a filtering step that obtains a single RLock and checks all the series and returns the non-existent series to contine though the slow path.	2017-04-20 12:28:41 -06:00
Jason Wilder	d7c5dd0a3e	Reduce wal sync goroutine churn Under high write load, the sync goroutine would startup, and end very frequently. Starting a new goroutine so frequently adds a small amount of latency which causes writes to take long and sometimes timeout. This changes the goroutine to loop until there are no more waiters which reduce the churn and latency.	2017-04-20 12:28:34 -06:00
Jason Wilder	aa9925621b	Fix deadlock in wal If the sync waiters channel was full, it would block sending to the channel while holding a the wal write lock. The sync goroutine would then be stuck acquiring the write lock and could not drain the channel. This increases the buffer to 1024 which would require a very high write load to fill as well as retuns and error if the channel is full to prevent the blocking.	2017-04-19 11:33:13 -06:00
Jason Wilder	a19ce9c10f	Reduce index lock contention Series and Measurment have their own locks and we do not need to hold locks on the index while using those types.	2017-04-18 16:32:33 -06:00
Jason Wilder	883b3dcbbb	Reduce lock content in AssignShard The lock shows up under write load. It only needs to be assigned once so a read lock eliminates the contention.	2017-04-18 16:32:33 -06:00
Jason Wilder	5c51ae7319	Merge branch '1.2' into jw-merge-123	2017-04-14 14:36:54 -06:00
Jason Wilder	ff1270dfeb	Fix dropping fields created data corruption The Point is intended to be immutable after being parsed since it is shared by several goroutines. When dropping a field (e.g. time), corrupted data can result if one goroutine is delete the field while another is marshaling the underlying byte slices. To avoid this, the shard will just skip invalid fields and series instead of trying to mutate them by deleting them.	2017-04-07 12:58:42 -06:00
Jason Wilder	1de99cd219	Merge pull request #8268 from influxdata/jw-dedup-measurements Ensure MeasurementNames deduplicates measurements across shards	2017-04-06 13:10:56 -06:00
Jason Wilder	927acb5ab9	Ensure MeasurementNames deduplicates measurements across shards	2017-04-06 12:17:29 -06:00
Jason Wilder	cf100647e0	Fix deadlock in Measurement.SeriesIDsAllOrByExpr SeriesIDsAllOrByExpr took a RLock and ended up calling SeriesIDs which can take a Lock causing a deadlock.	2017-04-05 16:22:45 -06:00
Ben Johnson	9c97cd8601	Merge remote-tracking branch 'upstream/master' into tsi	2017-04-04 12:46:09 -06:00
Ben Johnson	0d74497abe	Reset rhh map elements to reuse allocations.	2017-04-04 11:57:37 -06:00
Ben Johnson	6ff27c95e5	Fix tsi assertions.	2017-04-04 11:29:21 -06:00
Ben Johnson	dbc10559c4	Merge pull request #8247 from benbjohnson/tsi-series-block-partitioning TSI Series Block Partitioning	2017-04-04 11:15:10 -06:00
Jason Wilder	5fa8073fc2	Merge branch '1.2' into jw-merge-123	2017-04-04 11:12:06 -06:00
Jason Wilder	84cbee227a	Fix file store not close all TSM files Regression added via #8192	2017-04-04 10:58:51 -06:00
Ben Johnson	95d4016ff2	Merge branch 'tsi' of https://github.com/influxdata/influxdb into tsi-series-block-partitioning	2017-04-04 10:14:03 -06:00
Jason Wilder	ec7eea2a0f	Skip tests on appveyor that we skip w/ -race	2017-04-04 09:47:59 -06:00
Ben Johnson	bf49b176f5	Partition tsi1 series index.	2017-04-04 09:46:04 -06:00
Jason Wilder	793635dbd7	Skip TSI cardinality tests on appveyor	2017-04-04 09:19:43 -06:00
Jason Wilder	4f850b5cff	Skip TestCache_Deduplicate_Concurrent on windows	2017-04-04 08:48:55 -06:00
Jason Wilder	7ac3c9a26f	Remove unused cardinality func	2017-04-03 11:24:55 -06:00
Jason Wilder	fcdc3c5c21	Remove commented out code in meta	2017-04-03 11:22:59 -06:00
Jason Wilder	8da84e6144	Merge branch 'master' into tsi	2017-04-03 11:21:02 -06:00
Jason Wilder	68f73e64d1	Lazily sort Measurement.SeriesIDs Removing series while trying to maintain the sorted series list does not perform well when removing many series. This causes drop DB, RP, series, to be very slow in some cases. Instead, lazily create a sorted series list when first requested and invalidate it when dropping series.	2017-04-03 08:57:53 -06:00
Jason Wilder	32c4d43952	Speed up drop measurement This reworks drop measurement to use a sorted list of series keys instead of creating an intermediate map. It remove allocations and some extra garbage that is created during drop measurement.	2017-04-03 08:57:53 -06:00
Jason Wilder	a78da51b7c	Use buffered writer when writing tombstones When deleting many series, the many small writes flood the disks and consume a lot of CPU time.	2017-04-03 08:57:52 -06:00
Jason Wilder	6232d5e56d	Remove defer allocations in TSMReader	2017-04-03 08:57:52 -06:00
Jason Wilder	920c8396c6	Use sorted merge in FileStore.WalkKeys WalkKeys serially walked each TSM file and invoked fn for each key. Caller needed to handle duplicate calls to fn with the same key because the same key could exist in multiple TSM files. The serial execution was also slower. Since the series keys are already sorted, we can iterate over all files in parallel and skip duplicates using a sorted merge. This fixes the duplicate invocation issue as well as speeds up walking all keys. This can significant improve startup performance when many TSM files exists that may not have been fully compacted. This also has benefits for deletes (measurements/series) since duplicates are removed saving extra allocations and work. This may also allow for the optimize compaction to be removed provided startup times are fast enough.	2017-04-03 08:57:52 -06:00
Edd Robinson	5e342a2ddd	Ensure shared index removed on database drop When using the inmem index, if one drops a database, and then creates it again, the previous index object will be reused. This includes the previous cardinality estimation sketches, leading to inaccurate cardinality estimations.	2017-03-30 13:05:31 +01:00
Edd Robinson	ddf7f0fd7b	Remove uncalled method	2017-03-30 12:48:22 +01:00
Edd Robinson	fddaff2cc8	Merge master in	2017-03-29 18:00:28 +01:00
Edd Robinson	116230b427	Use varint for tag count	2017-03-29 16:31:13 +01:00
Edd Robinson	45f843fc91	Don't unassign shards when system shutting down	2017-03-29 11:57:38 +01:00
Ben Johnson	2edfb1c92d	Ignore series limit on database load.	2017-03-24 16:27:16 -06:00

2142 Commits (938db68198718d9899a91e00e3a598e85174320a)