influxdb

Commit Graph

Author	SHA1	Message	Date
Stuart Carnie	baa05de3f8	add benchmarks	2017-09-19 11:47:48 -07:00
Stuart Carnie	cfc6a1cd9f	implement optimization for Include function ``` benchmark old ns/op new ns/op delta BenchmarkIntegerValues_IncludeNone_1000-8 651 6.69 -98.97% BenchmarkIntegerValues_IncludeMiddleHalf_1000-8 1131 114 -89.92% BenchmarkIntegerValues_IncludeFirst_1000-8 638 33.9 -94.69% BenchmarkIntegerValues_IncludeLast_1000-8 1269 32.2 -97.46% BenchmarkIntegerValues_IncludeNone_10000-8 7751 6.76 -99.91% BenchmarkIntegerValues_IncludeMiddleHalf_10000-8 11582 1378 -88.10% BenchmarkIntegerValues_IncludeFirst_10000-8 7911 43.8 -99.45% BenchmarkIntegerValues_IncludeLast_10000-8 12442 38.4 -99.69% ``` (cherry picked from commit fb93ad5)	2017-09-19 09:53:28 -07:00
Stuart Carnie	ca40c1ad3c	<type>Values.Exclude function uses binary search and copy builtin ``` ± benchcmp old.txt new.txt benchmark old ns/op new ns/op delta BenchmarkIntegerValues_ExcludeNone_1000-8 1285 7.34 -99.43% BenchmarkIntegerValues_ExcludeMiddleHalf_1000-8 1258 148 -88.24% BenchmarkIntegerValues_ExcludeFirst_1000-8 1268 7.51 -99.41% BenchmarkIntegerValues_ExcludeLast_1000-8 1125 27.7 -97.54% BenchmarkIntegerValues_ExcludeNone_10000-8 12665 7.31 -99.94% BenchmarkIntegerValues_ExcludeMiddleHalf_10000-8 12039 976 -91.89% BenchmarkIntegerValues_ExcludeFirst_10000-8 12663 7.29 -99.94% BenchmarkIntegerValues_ExcludeLast_10000-8 10990 34.9 -99.68% ``` (cherry picked from commit d7a3c23)	2017-09-19 09:53:26 -07:00
Jason Wilder	940da04a34	Merge pull request #8829 from influxdata/jw-mmap Release mmap pages when shard is cold	2017-09-18 12:08:37 -06:00
Jason Wilder	31646aae3a	Release mmap pages when shard is cold This instructs the kernel that it can release memory used by mmap'd TSM files when they are not actively being used. If the mappings are used, the kernel will fault the pages back in. On Linux, this causes RES memory to drop immediately when run.	2017-09-18 11:51:51 -06:00
Edd Robinson	e39de3e427	Merge pull request #8782 from oiooj/pr-shard-fix Correctly check if the Shard is ready for queries or writes	2017-09-18 18:17:19 +01:00
Jonathan A. Sternberg	2228b91b0d	Unsigned data type parsing and prioritization	2017-09-14 12:28:13 -05:00
Jason Wilder	7d467c2047	Fix windows unmapping of anonymous index slice	2017-09-12 10:30:10 -06:00
Jason Wilder	b4b3c159cc	Fixup rebase	2017-09-11 17:04:10 -06:00
Jason Wilder	d5d9f9acfe	Remove debug line	2017-09-11 15:31:28 -06:00
Jason Wilder	26f92ce6ac	Remove commented out code	2017-09-11 15:30:05 -06:00
Jason Wilder	820856347c	Don't use disk temp file for snapshots	2017-09-11 15:29:26 -06:00
Jason Wilder	4ed9c75896	Fix unmapping anonymous memory slice	2017-09-11 15:29:26 -06:00
Jason Wilder	97f7857715	Remove mutex on TSMWriter This isn't used by more than one goroutine so locks are unnecessary.	2017-09-11 15:29:26 -06:00
Jason Wilder	a93a5e9bdf	Include the size of the key in the cache size	2017-09-11 15:29:26 -06:00
Jason Wilder	38460ec37e	Re-enable compactions during writes A cold shard that suddenly receives a lot of writes could get a very big cache that takes a long time to snapshot or causes the cache max memory limit to be hit more quickly. This re-enables the compactions if necessary during writes so we don't have to wait for the shard monitor goroutine to re-enable them.	2017-09-11 15:29:26 -06:00
Ben Johnson	ee4d3c7b3d	Invalidate all bloom filters.	2017-09-11 15:29:26 -06:00
Ben Johnson	3c2487b97a	Clean up tsi bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Ben Johnson	6af936ee61	Fix bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Ben Johnson	a40b2bb210	Simplify bloom filter invalidation.	2017-09-11 15:29:26 -06:00
Edd Robinson	408a78d904	Increase size of SeriesBlock partition	2017-09-11 15:29:26 -06:00
Jason Wilder	7388eb9499	Use disk when writing TSM index	2017-09-11 15:29:25 -06:00
Ben Johnson	0ec2736f23	Incrementally rebuild tsi bloom filters.	2017-09-11 15:29:25 -06:00
Jason Wilder	a5a2957567	Reduce allocation in log_file	2017-09-11 15:29:25 -06:00
Jason Wilder	d3e832b462	Use offheap memory for indirect index offsets slice	2017-09-11 15:29:25 -06:00
Jason Wilder	91eb9de341	Use existing TSMReader from file store during compactions Compactions would create their own TSMReaders for simplicity. With very high cardinality compactions, creating the reader and indirectIndex can start to use a significant amount of memory. This changes the compactions to use a reader that is already allocated and managed by the FileStore.	2017-09-11 15:29:25 -06:00
Jason Wilder	739ecd2ebd	Fix a compaction planning bug There was a race where the plan returned was for files that were just compacted so the compaction would immediately abort.	2017-09-11 15:26:25 -06:00
Jason Wilder	bc4fb0ea10	Sort index entries if necessary These are already sorted during compaction, so switch to sorting lazily to avoid the CPU and allocations. This would only occur when using the writer directly.	2017-09-11 15:26:25 -06:00
Jason Wilder	a9e89ede75	Reduce lock contention on Index Stat and Size are read-only and can take an RLock.	2017-09-11 15:26:25 -06:00
Jason Wilder	f18dec6a4a	Use sorted slice for writing TSM index The directIndex used by the TSMWriter maintained a map of series keys to index entries. When the index is written to the TSM file, the keys are sorted and then written out in order. The reason for this is because directIndex used to be the only index and it was optimized more for reading. The reading has been replaced by the indirectIndex so the map of keys ends up wasting space. During compactions, the series keys (and index entries) are already sorted so this change uses the sorting to avoid the map and sort when writing the index. This reduces allocations and CPU usage quite a bit for larger cardinality TSM files.	2017-09-11 15:26:24 -06:00
Jason Wilder	2a0d7935d7	Switch level 3 compactions to use fast compaction strategy This leaves the slower compactions that create full blocks to only the full compaction. This helps reduce CPU usage and memory while shards are hot, but increases disk usage (reduced compression) slightly.	2017-09-11 15:26:24 -06:00
Jason Wilder	94e229ff59	Merge branch 'master' into jw-drop-series	2017-09-08 15:34:32 -06:00
Jason Wilder	44e1d3f185	Merge pull request #8804 from influxdata/jw-wal-oom Fix increased memory usage in cache and wal	2017-09-08 15:10:53 -06:00
Jason Wilder	78922f9821	Set rc to nil when closing WALSegmentReader	2017-09-08 14:55:02 -06:00
Joe LeGasse	4fb35b373b	auth: apply series auth to TSI	2017-09-08 09:09:53 -04:00
Jason Wilder	b9b648e2a0	Dynamically allocate cache store The cache store can be memory intensive with many shards. This lazily allocates it when needed and frees it when the cache is empty and cold.	2017-09-07 16:35:08 -06:00
Jason Wilder	5581f8b4ae	Re-use WALSegmentReaders at startup	2017-09-07 12:56:17 -06:00
Jason Wilder	e39276b96f	Skip reading 0 byte wal segments	2017-09-07 12:24:54 -06:00
Jason Wilder	a8d9eeef36	Reduce lock contention when deleting high cardinality series Deleting high cardinality series could take a very long time, cause write timeouts, and deadlock the process. This fixes these issues by changing the approach for cleaning up the indexes and reducing lock contention. The prior approach deleted each series and updated every index (inmem) during the delete. This was very slow and caused the index to be locked while items in a slice were removed one by one. This has been changed to mark series as deleted and then rebuild the index asynchronously, which speeds up the process. There was also a deadlock that could occur when deleting the field set. Deleting the field set held a write lock and the function it invoked under the lock could try to take a read lock on the field set. This would then deadlock. This approach was also very slow and caused timeouts for writes. It now uses a faster approach that checks for the existence of the measurement in the cache and filestore, which does not take write locks.	2017-09-07 11:36:02 -06:00
Jonathan A. Sternberg	e18425757d	Merge pull request #8791 from influxdata/js-explain-cached-values Include the number of scanned cached values in the iterator cost	2017-09-06 16:00:30 -05:00
Jonathan A. Sternberg	590be193e5	Include the number of scanned cached values in the iterator cost	2017-09-06 15:41:07 -05:00
Stuart Carnie	4a6114028c	exported UnloadIndex checks for ready state	2017-09-05 11:22:13 -07:00
kun	8a283e248c	Correctly check if the Shard is ready for queries or writes	2017-09-03 15:14:58 +08:00
Jonathan A. Sternberg	091ea5f9a5	Merge pull request #8776 from influxdata/js-explain-plan Initial implementation of explain plan	2017-09-01 16:19:37 -05:00
Edd Robinson	51e886ba66	Merge pull request #8757 from oiooj/pr-cl Fix panic when the engine already closed in a shard	2017-09-01 16:59:12 +01:00
Jonathan A. Sternberg	50d404e690	Initial implementation of explain plan It prints the statistics of each iterator that will access the storage engine. For each access of the storage engine, it will print the number of shards that will potentially be accessed, the number of files that may be accessed, the number of series that will be created, the number of blocks, and the size of those blocks.	2017-09-01 09:01:10 -05:00
Jonathan A. Sternberg	466fc9026e	Reduce how long it takes to walk the varrefs in an expression This is used quite a bit to determine which fields are needed in a condition. When the condition gets large, the memory usage begins to slow it down considerably and it doesn't take care of duplicates.	2017-08-31 09:33:45 -05:00
Joe LeGasse	732a0c2eaa	Merge pull request #8769 from influxdata/jl-map-cleanup cleanup: remove poor usage of ',ok' with maps	2017-08-31 09:18:42 -04:00
Ben Johnson	1dbe0662d8	Use system cursors for measurement, series, and tag key meta queries.	2017-08-30 08:35:20 -06:00
Joe LeGasse	a95647b720	cleanup: remove poor usage of ',ok' with maps There are several places in the code where comma-ok map retrieval was being used poorly. Some were benign, like checking existence before issuing an unconditional delete with no cleanup. Others were potentially far more serious: assuming that if 'ok' was true, then the resulting pointer retrieved from the map would be non-nil. `nil` is a perfectly valid value to store in a map of pointers, and the comma-ok syntax is meant for when membership is distinct from having a non-zero value. There was only one or two cases that I saw that being used correctly for maps of pointers.	2017-08-30 09:49:31 -04:00
Stuart Carnie	51eb85193c	release lock to avoid dead lock when calling WalkWhereForSeriesIDs * WalkWhereForSeriesIDs may call SeriesIDs, which may attempt to upgrade from a `RLock` to a `Lock`, causing the dead lock	2017-08-29 16:12:51 -07:00
kun	5d5225e77d	Fix panic when engine closed in a shard	2017-08-29 17:22:45 +08:00
Stuart Carnie	0ced270197	fix race condition reading map	2017-08-28 13:36:49 -07:00
Edd Robinson	d011e43a1b	Address feedback	2017-08-23 10:47:01 +01:00
Edd Robinson	a5f4b929c9	Ensure Skip is called in test goroutine	2017-08-23 10:47:01 +01:00
Edd Robinson	9be7c5aaa6	Run relevant engine tests on both indexes	2017-08-23 10:47:01 +01:00
Edd Robinson	9c12607c3e	Ensure shard tests run with both indexes	2017-08-23 10:46:59 +01:00
Edd Robinson	e732cb7a39	Update benchmarks to use sub-benchmarks	2017-08-22 17:51:48 +01:00
Edd Robinson	dd808bb77a	Ensure TSI tests run with TSI index	2017-08-22 17:51:48 +01:00
Edd Robinson	bca4393494	Run most tests for both indexes	2017-08-22 17:51:48 +01:00
Jason Wilder	d305b89f74	Merge pull request #8726 from influxdata/jw-tsm-file-leak Fix leaking tmp file when large compaction aborted	2017-08-22 09:59:23 -05:00
Stuart Carnie	2ef9b489f0	Merge pull request #8727 from influxdata/sgc-finalizer log message when iterator is closed by finalizer	2017-08-22 07:29:38 -07:00
Stuart Carnie	d189621d07	log message when iterator closed by finalizer	2017-08-21 16:46:24 -07:00
Jason Wilder	e265d150be	Fix leaking tmp file when large compaction aborted If a large compaction was running and was aborted, it could leave some tmp files around for files that it had fully written. The current active file was cleaned up, but already completed ones were not. This would occur when a TSM file needed to roll over due to size.	2017-08-21 17:04:57 -06:00
Jonathan A. Sternberg	5ce6007347	Merge pull request #8724 from influxdata/js-remove-unused-cursor This cursor implementation appears to be completely unused	2017-08-21 17:44:51 -05:00
Jonathan A. Sternberg	c0f7a8af5b	This cursor implementation appears to be completely unused Remove it so that its existence doesn't confuse someone that this is actually the cursor. The real cursors appear to be in file_store.gen.go.	2017-08-21 16:27:23 -05:00
Stuart Carnie	25edd7bfdf	naming	2017-08-17 15:47:47 -07:00
Stuart Carnie	c86dc0d103	redundant allocation is overwritten by line 1769	2017-08-17 11:12:41 -07:00
Stuart Carnie	823f903cc6	inputs are closed if Merge returns error and use <type>FinalizerIterator * <type>FinalizerIterator sets a runtime finalizer and calls Close when garbage collected. This will ensure any associated cursors are closed and the associated TSM files released * `query.Iterators#Merge` call could return an error and the inputs would not be closed, causing a cursor leak	2017-08-17 11:12:18 -07:00
Jason Wilder	85842503be	Fix deadlock in engine/measurement fields The OnReplace func ends up trying to acquire locks on MeasurementFields. When it's called via snapshotting, this can deadlock because the snapshotting goroutine also holds an RLock on the engine. If a delete measurement call is run at the right time, it will lock the MeasurementFields and try to acquire a lock on the engine to disable compactions. This creates a deadlock. To fix this, the OnReplace callback is moved to a function param to allow only Replace calls as part of a compaction to invoke it as opposed to both snapshotting and compactions. Fixes #8713	2017-08-16 16:43:40 -06:00
Jonathan A. Sternberg	697759613c	Remove time comparisons from the inner sections of the storage engine	2017-08-16 16:51:13 -05:00
Jonathan A. Sternberg	8bd04ebe39	Remove TimeRange function and replace with a more accurate ConditionExpr function The ConditionExpr function is more accurate because it parses the condition and ensures that time conditions are actually used correctly. That means that attempting to combine conditions with OR will not result in the query silently pretending it's an AND and nested conditions work correctly so there is only one way to read the query. It also extracts the non-time conditions into a separate condition so we can stop attempting to parse around the time conditions in lower layers of the storage engine. This change does not remove those hacks, but a following commit should be able to sanitize the condition and remove them.	2017-08-16 16:45:35 -05:00
Jonathan A. Sternberg	9a2357c2c0	Separate the query engine into a separate package This change provides a clear separation between the query engine mechanics and the query language so that the language can be parsed and dealt with separate from the query engine itself.	2017-08-16 13:38:43 -05:00
Stuart Carnie	3caeee8a24	fix: cursor leak when cur == nil and aux or conds is not empty	2017-08-16 09:17:20 -07:00
Ben Johnson	e0d8cb0ef3	Cardinality AST, parser, & rewriter fixes.	2017-08-16 09:27:29 -06:00
Ben Johnson	60ab1282ea	Refactor system iterators. Previously pseudo iterators could be created for meta data such as series, measurement, and tag data. These iterators were created at a higher level and lacked a lot of the power of the query engine. This commit moves system iterators down to the series level and supports the following: - _name - _seriesKey - _tagKey - _tagValue - _fieldKey These can be used as normal fields such as: SELECT _seriesKey FROM cpu This will return all the series keys for `cpu`.	2017-08-16 09:27:29 -06:00
Ben Johnson	c9b5d60753	Parse SHOW CARDINALITY.	2017-08-16 09:27:15 -06:00
Ben Johnson	c4e2ba25c3	Merge pull request #8669 from benbjohnson/1392-tsi-index-migration TSI Index Migration Tool	2017-08-16 09:16:03 -06:00
David Norton	1d8d739418	fix #8677 : check for snapshot size == 0	2017-08-16 09:43:56 -04:00
Jason Wilder	186e44d227	Merge pull request #8702 from influxdata/jw-monitor-cpu Reduce CPU usage when checking series cardinality	2017-08-15 16:02:17 -06:00
Jason Wilder	c74932de94	Limit shard cardinality checks to 1 per database The tag cardinality checks were run for all inmem shards. Since inmem shards share the same index, a lot of the work is redundant. Inmem shards also need to sort their measurement and tag keys, which can be CPU intensive with many shards or higher cardinality. This changes the monitoring to just check one shard in each database, which should lower CPU usage due to excessive sorting. The longer term solution is to use TSI, which would not have this check or required sorting.	2017-08-15 12:17:18 -06:00
Ben Johnson	06bc3b6fbf	TSI Index Migration	2017-08-15 11:40:24 -06:00
Jason Wilder	90e2cadeb6	Fix drop measurement not dropping all data If there were multiple shards, drop measurement could update the index and remove the measurement before the other shards ran their deletes. This causes the later shards to not see any series to delete. The fix is to allow deleteSeries to handle the index delete, which already accounts for removing the measurement when it is fully removed from the index.	2017-08-15 11:19:45 -06:00
Jason Wilder	61b13eb12b	Fix partiallyRead logic The partiallyRead func didn't account for the initial values and would return true for blocks that had not been read at all. This causes a slower path during compactions that forces a block to be decoded when it could just be merged as is without being decoded. This causes compactions to consume more CPU and run slower at times.	2017-08-14 16:44:32 -06:00
Edd Robinson	45969ef3c6	Allow tag filtering when using DELETE with tsi1	2017-08-14 19:09:36 +01:00
Joe LeGasse	1121b69a9e	auth: apply FGA to SHOW SERIES	2017-08-09 14:56:53 -04:00
Edd Robinson	0f648e5170	Remove unsafe shenanigans	2017-08-03 16:38:05 +01:00
Edd Robinson	2d57b599e9	Remove debugging statement	2017-08-02 17:24:00 +01:00
Edd Robinson	da676a79ae	Implement TSI iterator	2017-08-02 16:29:14 +01:00
Edd Robinson	befae864bd	Add tests for merge function	2017-08-02 14:10:52 +01:00
Edd Robinson	aa7095be5a	Use a merge-based approach for TagValues	2017-08-02 14:10:52 +01:00
Jason Wilder	94a48774b7	Pull in new index filter	2017-08-02 14:10:52 +01:00
Edd Robinson	1e9ce8e0a7	Add test for TagValues	2017-08-02 14:10:52 +01:00
Stuart Carnie	5449285c4c	Merge pull request #8652 from influxdata/sgc-literal-cursor Reduce allocations using nil cursors and literal value cursors	2017-08-01 10:20:24 -07:00
Jason Wilder	173276a409	Remove unused filestore reference Reduces cursor struct size from 119 bytes to 111.	2017-08-01 09:41:16 -06:00
Stuart Carnie	ff65f0f24d	Reduce allocations using nil cursors and literal value cursors ``` benchmark old ns/op new ns/op delta BenchmarkIntegerIterator_Next-8 82.8 22.7 -72.58% benchmark old allocs new allocs delta BenchmarkIntegerIterator_Next-8 3 0 -100.00% benchmark old bytes new bytes delta BenchmarkIntegerIterator_Next-8 32 0 -100.00% ```	2017-07-30 09:15:34 -07:00
Jason Wilder	3d12c62121	Avoid repeatedly growing decoded values slices	2017-07-28 11:00:56 -06:00
Jason Wilder	778000435a	Convert all keys from string to []byte in TSM engine This switches all the interfaces that take a string series key to take a []byte. This eliminates many small allocations where we convert between the two repeatedly. Eventually, this change should propagate further up the stack.	2017-07-28 11:00:50 -06:00
Jason Wilder	8009da0187	Remove some extra cursor buffers that are not needed	2017-07-28 10:53:07 -06:00
Jason Wilder	6582caa78b	Reduce allocations when creating KeyCursors The refs map was to increment the file references one time each. It doesn't hurt to increment them multiple times though. We also do not need to copy the files slice as we are accessing it under a read lock so it can't be changed.	2017-07-28 10:53:07 -06:00


1889 Commits (f20cab6e994b0df80a8101c9085c3d13d40800b4)