influxdb

Commit Graph

Author	SHA1	Message	Date
Ben Johnson	c61db43dc2	Update tagKeyValue mutex to write lock. This commit changes the read lock to a write lock when calling the `ids()` function because `ids()` can mutate the underlying series ids slice.	2019-02-15 09:29:48 -07:00
Jeff Wendling	40d2b70376	Ensure that cached series id sets are Go heap backed	2019-02-12 15:20:24 -07:00
Ben Johnson	6e5226437a	Convert TagValueSeriesIDCache to use string fields. This commit changes `name`, `key`, and `value` from `[]byte` to `string`.	2019-02-12 14:22:40 -07:00
Ben Johnson	aa3dfc0662	Merge pull request #11791 from influxdata/bj-revert-limit-full-compaction-1.8 Revert "Limit force-full and cold compaction size."	2019-02-11 12:50:55 -07:00
Ben Johnson	b87605f521	Fix shard epoch race.	2019-02-11 12:15:46 -07:00
Ben Johnson	198f6fde38	Fix deleteSeriesRange() race condition.	2019-02-11 11:29:09 -07:00
Ben Johnson	2dd913d71b	Revert "Limit force-full and cold compaction size." This reverts commit `40db64d0b9`.	2019-02-11 11:07:44 -07:00
Edd Robinson	05e7def600	Merge pull request #10332 from ludweeg/ludweeg/unslice Simplify s[:] to s where s is a slice	2019-02-11 10:24:43 +00:00
Grzegorz Pomykala	8448cf4a9c	build fixed	2019-02-06 16:27:59 +01:00
Jonathan A. Sternberg	2811cde76d	Merge pull request #10414 from seebs/seebs/valuerReuse reuse ValuerEval objects	2019-02-06 09:14:15 -06:00
Grzegorz Pomykala	fb3c837de9	code review suggestions applied	2019-02-06 09:10:51 +01:00
Seebs	5525240de3	reuse ValuerEval objects Scanner objects and iterators often need a ValuerEval. This object is created, often with a function call, and has at least one interface in it, so it allocates storage. Then it's dropped again right away. The only part of it that might be subject to change is usually a map. While the map's contents change over time, the actual map doesn't change for the lifetime of the object. So, in both iterators and scanners, stash the ValuerEval and continue reusing it. On a query returning a fair number of data points, this produces a small (<5% in practice) improvement in observed performance, visible as a significant reduction in time spent in runtime (mallocgc, newobject, etcetera). The performance improvement isn't big, but it's reasonably easy to evaluate it and establish that it's a safe change to make. Signed-off-by: seebs <seebs@seebs.net>	2019-02-05 15:10:23 -06:00
Ben Johnson	def9589584	Merge pull request #10522 from hahnjo/fix-compaction-cache-snapshots Fix compaction logic on infrequent cache snapshots	2019-02-04 08:34:03 -07:00
Ben Johnson	4083ae01e3	Merge branch '1.8' into hpb-no-series-rebuild-on-delete-when-series-still-in-cache	2019-02-04 08:32:04 -07:00
Edd Robinson	3a81921bb0	Merge pull request #10505 from hpbieker/hpb-no-series-rebuild-on-delete-without-overlap-timerange Do not rebuild series index on delete for series not overlapping in time	2019-02-04 03:22:00 -08:00
Ben Wells	e9bada090f	Fix misspelling identified by misspell	2019-02-03 20:27:43 +00:00
Ben Johnson	0c6d77d952	Merge pull request #9944 from michaelyou/hotfix-hashring-mod Hash ring's hash mod	2019-02-01 12:54:15 -08:00
Edd Robinson	301ab71ba0	Remove copy-on-write when caching bitmaps In the case of caching TSI bitmaps belonging to immutable .tsi files, the underlying bitset data can be mmapped. It is possible, though rare, for this data to be unmapped (e.g., via a TSI compaction) but for the cached bitmap to be subsequently read. This leads to a segfault. This only happens when copy-on-write is set to true on the roaring bitmap, because in that case only the internal pointers are cloned. This change will reduce the TSI cache performance by around 10%, which I have deemed to account for only a few microseconds typically.	2019-01-25 18:02:48 +00:00
Edd Robinson	efdddbb31a	Allow TSI bitset cache size to be configured This commit adds a config option to the tsdb Config allowing the size of the bitset cached in the TSI index to be specified. Setting the cache size to 0 will disable the cache.	2019-01-24 17:41:45 +00:00
Edd Robinson	e20541d2ba	Expose functional option for setting TSI cache size	2019-01-23 17:15:48 +00:00
Edd Robinson	3a055a6107	Fix cardinality estimation error This commit fixes an error in the TSI index with estimating the cardinality of series recently added and then removed.	2019-01-10 17:46:30 +00:00
Edd Robinson	77fe5a9a62	Treat fields and measurements as raw bytes	2018-12-19 14:38:50 +00:00
Edd Robinson	348dac1672	Add repro test case for UTF-8 issue	2018-12-19 14:38:31 +00:00
Ben Johnson	b88d852c54	Merge pull request #10536 from influxdata/bj-limit-full-compaction Limit force-full and cold compaction size.	2018-12-17 10:44:10 -07:00
Tanya Gordeeva	0a39786ea7	tsdb: mixed shard tests Specifically tests around the global index for fields with mixed shard types.	2018-12-13 08:31:49 -08:00
Grzegorz Pomykala	be97f36e66	trace on SeriesFile.Open() failure	2018-12-06 15:58:05 +01:00
Grzegorz Pomykala	fbfcfa0b31	reproduction for #10540	2018-12-06 14:36:44 +01:00
Grzegorz Pomykala	a346109198	do not acquire a lock upon closing a SeriesFile when called from Open() method	2018-12-06 11:59:03 +01:00
Ben Johnson	40db64d0b9	Limit force-full and cold compaction size. This commit limits the number of files that can be compacted in a single group when forcing a full compaction or when a shard becomes cold. This is to prevent too many files being compacted at the same time.	2018-12-05 10:18:56 -07:00
Stuart Carnie	39a3d2335e	chore(flux): Update to Flux 0.7.1 Resolve breaking API changes	2018-11-30 10:38:56 -07:00
Jeff Wendling	9f0cd683b9	Merge pull request #10516 from influxdata/jmw-conflict-concurrency tsdb: conflict based concurrency resolution	2018-11-29 14:14:24 -07:00
Ben Johnson	cd1e1ca755	Merge pull request #10525 from influxdata/bj-warn-series-file Skip and warn series files in retention policy directory.	2018-11-28 11:42:29 -07:00
Jeff Wendling	cca97bf9b9	Merge pull request #10517 from influxdata/jmw-always-cleanup-fields-index tsdb: clean up fields index for every kind of delete	2018-11-28 11:33:34 -07:00
Ben Johnson	298eddb82c	Skip and warn series files in retention policy directory.	2018-11-28 11:20:18 -07:00
Jeff Wendling	259f3fe6e5	tsdb: consider measurement drops per shard on inmem	2018-11-27 16:59:17 -07:00
Jeff Wendling	0a2f6191a6	tsdb: clean up fields index for every kind of delete Before this, if you deleted everything with `delete where true` for example, then you would be left with all of your measurements in the fields index. That would cause ghost fields to reappear if someone reinserted to the measurement. This fixes that by making it so the deepest most delete code checks if the measurement was removed from the index, and if so cleaning it up out of the fields index. Additionally, it fixes bugs in that cleanup code where if you had a measurement like "m1" and "m10", when iterating over the cache or file store, "m1" would match "m10" due to it only checking the prefix. This also has it check the character right after the measurement to be either a comma because tags started, or the first character of the field separator.	2018-11-27 16:12:06 -07:00
Jonas Hahnfeld	217772752d	Fix compaction logic on infrequent cache snapshots This change fixes #10511 that manifests when a shard is considered cold faster than its cache is snapshotted. This can happen if WAL is enabled because previously the code only considered the last modification of compacted tsm1 files. Instead Engine.LastModified() also takes the WAL into account if necessary.	2018-11-26 22:05:37 +01:00
Jeff Wendling	4cad51a604	tsdb: conflict based concurrency resolution There are some problematic races that occur when deletes happen against writes to the same points at the same time. This change introduces guards and an epoch based system to coordinate these modifications. A guard matches a point based on the time, measurement name, and some conditions loaded from an influxql expression. The intent is to be as precise as possible without allowing any false negatives: if a point would be deleted, the guard must match it. We are allowed to match more points than necessary, at the cost of slowing down writes. The epoch based system keeps track of outstanding writes and deletes and their associated guards. When a delete operation is going to start, it waits until all current writes are done, and installs its guard, blocking all future writes that contain points that may conflict with the delete. This allows writes to disjoint points to proceed uncontended, and the implementation is optimized for assuming there are few outstanding deletes. For example, in the case that there are no deletes, a write just has to take a mutex, bump a counter, and compare a value against zero. The epoch trackers are per shard, so that different shards never have to contend with one another.	2018-11-21 19:19:53 -07:00
Jeff Wendling	030adf4bd5	tsdb: don't allow deletes to a database in mixed index mode TSI1 and inmem indexes have different properties during deletes. Specifically, inmem shares a global index across all shards, where every tsi1 index is contained to a specific shard. When deleting a series, it may cause the last reference to the series across all shards to be dropped, necessitating a removal from the series file. Since the inmem index shares the index across all shards, removing the series when it's removed from the series file is sufficient. However, in the case of a mixed index database, if the last shard is a TSI1 shard, the other inmem indexes are not available when we discover that it was the last reference to the series. This ends up leaving the series in the inmem index without a series id in the series file, causing all sorts of misbehavior. Rather than continue curling ourselves into a ball to try to fix this unsupported mode, give a helpful error message to the user that they must run their database in a non-mixed index mode to allow deletes.	2018-11-21 18:18:38 -07:00
Hans Petter Bieker	4670b8d65e	Removed file that should not have been added.	2018-11-20 16:39:27 +01:00
Hans Petter Bieker	1d5463e0a0	Do not rebuild series index on delete for series not overlapping in time.	2018-11-20 16:24:13 +01:00
Hans Petter Bieker	926f78d832	Do not rebuild series index on delete when the series still exists in the cache.	2018-11-20 10:34:59 +01:00
Stuart Carnie	c3d7f3de2b	fix: Allow compactor to make progress if v.MaxTime() != entry.MaxTime	2018-11-14 09:13:13 -07:00
Stuart Carnie	5d083887a5	chore: Compactor test which replicates issue #10465 Due to an encoding bug with simple8b, it is possible that the MaxTime for a TSM index entry does not match the last encoded timestamp.	2018-11-14 09:13:13 -07:00
Jonathan A. Sternberg	a16096cbc4	Merge pull request #9943 from michaelyou/hotfix-typo Some typo and Wrong position of comment	2018-11-05 12:36:05 -06:00
Tanya Gordeeva	8b8421049e	tsdb: benchmark for many fields	2018-11-02 18:49:28 -07:00
Tanya Gordeeva	7c9ff60413	tsdb/shard: reduce measurement field copying Removes cloning measurement fields on writes, instead atomically swaps out measurement field sets when fields are added (with new overhead of copying existing fields whenever a new one is added).	2018-11-02 18:49:17 -07:00
Tanya Gordeeva	f13a1293f2	tsdb/shard_test: add comparative benchmarks for measurement cardinalities Reuses some existing benchmarks, but ensuring that we write equal numbers of points for comparison.	2018-11-02 18:49:17 -07:00
Edd Robinson	be662a5853	Fix TSM index maxtime modification	2018-10-29 15:44:31 +00:00
Edd Robinson	0f67d8f294	Merge pull request #10387 from influxdata/er-index-vars Add shards' index types to /debug/vars	2018-10-29 10:12:05 +00:00
Edd Robinson	42827219f3	Merge pull request #10423 from influxdata/er-nil-shard Fix panic in IndexSet	2018-10-26 19:05:08 +01:00
Jeff Wendling	5c2d36225d	fix(tsdb): copy measurement names when expression is provided We already make copies when no expression is provided, because the backing slices may go away if the shard they came from is closed. This fixes the other spot where some backing slices would be returned.	2018-10-26 11:25:25 -06:00
Edd Robinson	cade59e253	Fix panic in IndexSet This commit fixes a panic where a concurrent removal of a shard and meta query could cause a `nil` index to be added to the `IndexSet`.	2018-10-26 18:23:54 +01:00
David Norton	3ad44c0ff4	error if manifest is read/written more than once This change makes the shard digest writer and reader return an error if the manifest is written or read more than once.	2018-10-22 14:42:05 -04:00
David Norton	3d01051dfc	make digest reader skip manifest if needed This change makes the digest reader read and discard the manifest if needed. Not all readers of a digest are interested in the manifest. This change also makes it a requirement for the writer to write a manifest because it is a non-optional part of a digest file.	2018-10-22 13:14:35 -04:00
Edd Robinson	9b4cf1e39c	Add the shard index type to /debug/vars This commit adds an `indexType` key to the shard sections of the `/debug/vars` endpoint, as well as the `_internal` shard statistics. The tag will be reported as `"indexType": "inmem"` or `"indexType": "tsi1"`.	2018-10-18 13:46:12 +01:00
Stuart Carnie	9520b8d956	fix(tsdb): Fix race calling filterShards outside a lock Move filterShards inside the lock, as it enumerates the shards map, which can result in data race when the map is written concurrently.	2018-10-17 14:14:53 -07:00
Stuart Carnie	0734f6fe21	feat(tsm1): Improve performance of Gorilla float block decoding ``` name old time/op new time/op delta FloatArrayDecodeAll/1-8 45.9ns ± 1% 13.8ns ± 1% -70.00% (p=0.000 n=9+9) FloatArrayDecodeAll/55-8 686ns ± 0% 232ns ± 1% -66.10% (p=0.000 n=9+8) FloatArrayDecodeAll/550-8 5.78µs ± 0% 2.22µs ± 1% -61.61% (p=0.000 n=9+9) FloatArrayDecodeAll/1000-8 10.2µs ± 2% 4.0µs ± 5% -60.47% (p=0.000 n=10+10) name old speed new speed delta FloatArrayDecodeAll/1-8 414MB/s ± 1% 1383MB/s ± 1% +233.76% (p=0.000 n=9+9) FloatArrayDecodeAll/55-8 144MB/s ± 0% 424MB/s ± 1% +194.19% (p=0.000 n=9+9) FloatArrayDecodeAll/550-8 133MB/s ± 0% 346MB/s ± 1% +160.09% (p=0.000 n=9+10) FloatArrayDecodeAll/1000-8 135MB/s ± 2% 340MB/s ± 5% +153.03% (p=0.000 n=10+10) ```	2018-10-16 17:28:36 -07:00
Stuart Carnie	4dccba29c3	chore(tsm1): go fmt file	2018-10-16 17:07:19 -07:00
Ben Johnson	a989b01356	Merge pull request #10249 from hpbieker/hpb-delete-from-prevent-rebuild-series Prevent DELETE FROM to rebuild series files for shards where nothing is deleted	2018-10-16 14:53:09 -06:00
Edd Robinson	5054d6fae4	Address PR feedback	2018-10-16 13:37:49 +01:00
Stuart Carnie	a792fbbdfa	fix(encoding): Improve array string encoding perf a little more Encode the compressed data at the start internal buffer. This ensures the returned slice maintains the entire capacity and is available for subsequent use. When we pool / reuse string buffers, this will help considerably. Improvements over previous commit: ``` name old time/op new time/op delta EncodeStrings/10/batch-8 542ns ± 1% 355ns ± 2% -34.53% (p=0.008 n=5+5) EncodeStrings/100/batch-8 5.29µs ± 1% 3.58µs ± 2% -32.20% (p=0.008 n=5+5) EncodeStrings/1000/batch-8 48.6µs ± 0% 36.2µs ± 2% -25.40% (p=0.008 n=5+5) name old alloc/op new alloc/op delta EncodeStrings/10/batch-8 704B ± 0% 0B -100.00% (p=0.008 n=5+5) EncodeStrings/100/batch-8 9.47kB ± 0% 0.00kB -100.00% (p=0.008 n=5+5) EncodeStrings/1000/batch-8 90.1kB ± 0% 0.0kB -100.00% (p=0.008 n=5+5) name old allocs/op new allocs/op delta EncodeStrings/10/batch-8 0.00 0.00 ~ (all equal) EncodeStrings/100/batch-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) EncodeStrings/1000/batch-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5) ```	2018-10-16 12:08:12 +01:00
Stuart Carnie	964bc3c19e	fix(encoding): Improve simple8b another 6%; fix inconsequential bug simple8b encodes deltas[1:], thus deltas[0] >= simple8b.MaxValue is invalid. Also changed loop calculating deltas, RLE and max to be similar to batch timestamp, for greater consistency. Improvements over previous commit: ``` name old time/op new time/op delta name old time/op new time/op delta EncodeIntegers/1000_seq/batch-8 1.50µs ± 1% 1.48µs ± 1% -1.40% (p=0.008 n=5+5) EncodeIntegers/1000_ran/batch-8 6.10µs ± 0% 5.69µs ± 2% -6.58% (p=0.008 n=5+5) EncodeIntegers/1000_dup/batch-8 1.50µs ± 1% 1.49µs ± 0% -1.21% (p=0.008 n=5+5) ``` Improvements overall: ``` name old time/op new time/op delta EncodeIntegers/1000_seq/batch-8 2.04µs ± 0% 1.48µs ± 1% -27.25% (p=0.008 n=5+5) EncodeIntegers/1000_ran/batch-8 8.80µs ± 2% 5.69µs ± 2% -35.29% (p=0.008 n=5+5) EncodeIntegers/1000_dup/batch-8 2.03µs ± 1% 1.49µs ± 0% -26.93% (p=0.008 n=5+5) ```	2018-10-16 12:08:12 +01:00
Stuart Carnie	43f96a6ddf	feat(encoding): Improve timestamp encoding Timestamp improvements prior to any improvements to simple8b ``` name old time/op new time/op delta name old time/op new time/op delta EncodeTimestamps/1000_seq/batch-8 2.64µs ± 1% 1.36µs ± 1% -48.25% (p=0.008 n=5+5) EncodeTimestamps/1000_ran/batch-8 64.0µs ± 1% 32.2µs ± 1% -49.64% (p=0.008 n=5+5) EncodeTimestamps/1000_dup/batch-8 9.32µs ± 0% 1.30µs ± 1% -86.06% (p=0.008 n=5+5) ```	2018-10-16 12:08:12 +01:00
Stuart Carnie	e9531b7830	feat(encoding): Improve integer and simple8b encoding performance simple8b EncodeAll improvements should ``` name old time/op new time/op delta EncodeAll/1_bit-8 28.5µs ± 1% 28.6µs ± 1% ~ (p=0.133 n=9+10) EncodeAll/2_bits-8 28.9µs ± 2% 28.7µs ± 0% ~ (p=0.068 n=10+8) EncodeAll/3_bits-8 29.3µs ± 1% 28.8µs ± 0% -1.70% (p=0.000 n=10+10) EncodeAll/4_bits-8 29.6µs ± 1% 29.1µs ± 1% -1.85% (p=0.000 n=10+10) EncodeAll/5_bits-8 30.6µs ± 1% 29.8µs ± 2% -2.70% (p=0.000 n=10+10) EncodeAll/6_bits-8 31.3µs ± 1% 30.0µs ± 1% -4.08% (p=0.000 n=9+9) EncodeAll/7_bits-8 32.6µs ± 1% 30.8µs ± 0% -5.49% (p=0.000 n=9+9) EncodeAll/8_bits-8 33.6µs ± 2% 31.0µs ± 1% -7.77% (p=0.000 n=10+9) EncodeAll/10_bits-8 34.9µs ± 0% 31.9µs ± 2% -8.55% (p=0.000 n=9+10) EncodeAll/12_bits-8 36.8µs ± 1% 32.6µs ± 1% -11.35% (p=0.000 n=9+10) EncodeAll/15_bits-8 39.8µs ± 1% 34.1µs ± 2% -14.40% (p=0.000 n=10+10) EncodeAll/20_bits-8 45.2µs ± 3% 36.2µs ± 1% -19.97% (p=0.000 n=10+9) EncodeAll/30_bits-8 55.0µs ± 0% 40.9µs ± 1% -25.62% (p=0.000 n=9+9) EncodeAll/60_bits-8 86.2µs ± 1% 55.2µs ± 1% -35.92% (p=0.000 n=10+10) EncodeAll/combination-8 582µs ± 2% 502µs ± 1% -13.80% (p=0.000 n=9+9) ``` EncodeIntegers: ``` name old time/op new time/op delta EncodeIntegers/1000_seq/batch-8 2.04µs ± 0% 1.50µs ± 1% -26.22% (p=0.008 n=5+5) EncodeIntegers/1000_ran/batch-8 8.80µs ± 2% 6.10µs ± 0% -30.73% (p=0.008 n=5+5) EncodeIntegers/1000_dup/batch-8 2.03µs ± 1% 1.50µs ± 1% -26.04% (p=0.008 n=5+5) ``` EncodeTimestamps (ran is improved due to simple8b improvements) ``` name old time/op new time/op delta EncodeTimestamps/1000_seq/batch-8 2.64µs ± 1% 2.65µs ± 2% ~ (p=0.310 n=5+5) EncodeTimestamps/1000_ran/batch-8 64.0µs ± 1% 33.8µs ± 1% -47.23% (p=0.008 n=5+5) EncodeTimestamps/1000_dup/batch-8 9.32µs ± 0% 9.28µs ± 1% ~ (p=0.087 n=5+5) ```	2018-10-16 12:08:12 +01:00
Edd Robinson	91d0a8c3d2	Fix index bug in float encoder	2018-10-16 12:08:12 +01:00
Edd Robinson	09da18c08e	Add TSM batch key iterator The batch focussed TSM key iterator iterates TSM blocks, decoding and merging blocks where appropriate using the batch focussed approaches.	2018-10-16 12:08:12 +01:00
Edd Robinson	51233b71a5	Add batch block encoders	2018-10-16 12:05:52 +01:00
Edd Robinson	592127e411	Batch oriented unsigned encoder	2018-10-16 12:05:52 +01:00
Edd Robinson	a7a70a920e	Batch oriented boolean encoders This commit adds a tsm1 function for encoding a batch of booleans into a provided buffer. The following benchmarks compare the performance of the existing iterator based encoders, and the new batch oriented encoders using randomly generated sets of booleans.	2018-10-16 12:05:52 +01:00
Jeff Wendling	a4d4ef6999	Improvements to batch float encoder - Inlined the closure to avoid a function call. - Changed append(b, make([]byte, 8)...) to inline the make call. - Check for NaN once at the end assuming NaN is infrequent. New performance delta comparing the current iterators to the new batch function: name old time/op new time/op delta EncodeFloats/10_seq 1.32µs ± 2% 0.17µs ± 2% -87.39% (p=0.000 n=10+10) EncodeFloats/10_ran 2.09µs ± 1% 0.15µs ± 0% -92.97% (p=0.000 n=10+9) EncodeFloats/100_seq 8.37µs ± 2% 1.28µs ± 2% -84.74% (p=0.000 n=10+10) EncodeFloats/100_ran 19.1µs ± 1% 1.3µs ± 1% -93.08% (p=0.000 n=9+9) EncodeFloats/1000_seq 60.4µs ± 1% 12.6µs ± 0% -79.13% (p=0.000 n=9+7) EncodeFloats/1000_ran 212µs ± 1% 12µs ± 1% -94.53% (p=0.000 n=9+8) name old alloc/op new alloc/op delta EncodeFloats/10_seq 0.00B 0.00B ~ (all equal) EncodeFloats/10_ran 0.00B 0.00B ~ (all equal) EncodeFloats/100_seq 0.00B 0.00B ~ (all equal) EncodeFloats/100_ran 0.00B 0.00B ~ (all equal) EncodeFloats/1000_seq 0.00B 0.00B ~ (all equal) EncodeFloats/1000_ran 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta EncodeFloats/10_seq 0.00 0.00 ~ (all equal) EncodeFloats/10_ran 0.00 0.00 ~ (all equal) EncodeFloats/100_seq 0.00 0.00 ~ (all equal) EncodeFloats/100_ran 0.00 0.00 ~ (all equal) EncodeFloats/1000_seq 0.00 0.00 ~ (all equal) EncodeFloats/1000_ran 0.00 0.00 ~ (all equal)	2018-10-16 12:05:52 +01:00
Edd Robinson	ee607f9288	Batch oriented string encoders This commit adds a tsm1 function for encoding a batch of strings into a provided buffer. The new function also shares the buffer between the input data and the snappy encoded output, reducing allocations. The following benchmarks compare the performance of the existing iterator based encoders, and the new batch oriented encoders using randomly generated strings. name old time/op new time/op delta EncodeStrings/10 2.14µs ± 4% 1.42µs ± 4% -33.56% (p=0.000 n=10+10) EncodeStrings/100 12.7µs ± 3% 10.9µs ± 2% -14.46% (p=0.000 n=10+10) EncodeStrings/1000 132µs ± 2% 114µs ± 2% -13.88% (p=0.000 n=10+9) name old alloc/op new alloc/op delta EncodeStrings/10 657B ± 0% 704B ± 0% +7.15% (p=0.000 n=10+10) EncodeStrings/100 6.14kB ± 0% 9.47kB ± 0% +54.14% (p=0.000 n=10+10) EncodeStrings/1000 61.4kB ± 0% 90.1kB ± 0% +46.66% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EncodeStrings/10 3.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeStrings/100 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) EncodeStrings/1000 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)	2018-10-16 12:05:52 +01:00
Edd Robinson	d1b7e02483	Batch oriented timestamp encoders This commit adds a tsm1 function for encoding a batch of timestamps into a provided buffer. The following benchmarks compare the performance of the existing iterator based encoders, and the new batch oriented encoders. They look at a sequential input slice, a randomly generated input slice and a duplicate slice. All slices are sorted. name old time/op new time/op delta EncodeTimestamps/10_seq 153ns ± 2% 104ns ± 2% -31.62% (p=0.000 n=9+10) EncodeTimestamps/10_ran 191ns ± 2% 142ns ± 0% -25.73% (p=0.000 n=10+9) EncodeTimestamps/10_dup 114ns ± 1% 68ns ± 4% -39.77% (p=0.000 n=8+10) EncodeTimestamps/100_seq 704ns ± 2% 321ns ± 2% -54.44% (p=0.000 n=9+9) EncodeTimestamps/100_ran 7.27µs ± 4% 7.01µs ± 2% -3.59% (p=0.000 n=10+10) EncodeTimestamps/100_dup 756ns ± 3% 396ns ± 2% -47.57% (p=0.000 n=10+10) EncodeTimestamps/1000_seq 6.32µs ± 1% 2.46µs ± 2% -61.01% (p=0.000 n=8+10) EncodeTimestamps/1000_ran 108µs ± 0% 68µs ± 3% -37.57% (p=0.000 n=8+10) EncodeTimestamps/1000_dup 7.26µs ± 1% 3.64µs ± 1% -49.80% (p=0.000 n=10+8) name old alloc/op new alloc/op delta EncodeTimestamps/10_seq 0.00B 0.00B ~ (all equal) EncodeTimestamps/10_ran 0.00B 0.00B ~ (all equal) EncodeTimestamps/10_dup 0.00B 0.00B ~ (all equal) EncodeTimestamps/100_seq 0.00B 0.00B ~ (all equal) EncodeTimestamps/100_ran 0.00B 0.00B ~ (all equal) EncodeTimestamps/100_dup 0.00B 0.00B ~ (all equal) EncodeTimestamps/1000_seq 0.00B 0.00B ~ (all equal) EncodeTimestamps/1000_ran 0.00B 0.00B ~ (all equal) EncodeTimestamps/1000_dup 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta EncodeTimestamps/10_seq 0.00 0.00 ~ (all equal) EncodeTimestamps/10_ran 0.00 0.00 ~ (all equal) EncodeTimestamps/10_dup 0.00 0.00 ~ (all equal) EncodeTimestamps/100_seq 0.00 0.00 ~ (all equal) EncodeTimestamps/100_ran 0.00 0.00 ~ (all equal) EncodeTimestamps/100_dup 0.00 0.00 ~ (all equal) EncodeTimestamps/1000_seq 0.00 0.00 ~ (all equal) EncodeTimestamps/1000_ran 0.00 0.00 ~ (all equal) EncodeTimestamps/1000_dup 0.00 0.00 ~ (all equal)	2018-10-16 12:05:52 +01:00
Edd Robinson	de5ca4a108	Batch oriented int encoders This commit adds a tsm1 function for encoding a batch of ints into a provided buffer. The following benchmarks compare the performance of the existing iterator based encoders, and the new batch oriented encoders. They look at a sequential input slice, a randomly generated input slice and a duplicate slice: name old time/op new time/op delta EncodeIntegers/10_seq 144ns ± 2% 41ns ± 1% -71.46% (p=0.000 n=10+10) EncodeIntegers/10_ran 304ns ± 7% 140ns ± 2% -53.99% (p=0.000 n=10+10) EncodeIntegers/10_dup 147ns ± 4% 41ns ± 2% -72.14% (p=0.000 n=10+9) EncodeIntegers/100_seq 483ns ± 7% 208ns ± 1% -56.98% (p=0.000 n=10+9) EncodeIntegers/100_ran 1.64µs ± 7% 1.01µs ± 1% -38.42% (p=0.000 n=9+9) EncodeIntegers/100_dup 484ns ±14% 210ns ± 2% -56.63% (p=0.000 n=10+10) EncodeIntegers/1000_seq 3.11µs ± 2% 1.81µs ± 2% -41.68% (p=0.000 n=10+10) EncodeIntegers/1000_ran 16.9µs ±10% 11.0µs ± 2% -34.58% (p=0.000 n=10+10) EncodeIntegers/1000_dup 3.05µs ± 3% 1.81µs ± 2% -40.71% (p=0.000 n=10+8) name old alloc/op new alloc/op delta EncodeIntegers/10_seq 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/10_ran 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/10_dup 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/100_seq 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/100_ran 128B ± 0% 0B -100.00% (p=0.000 n=10+10) EncodeIntegers/100_dup 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_seq 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_ran 1.15kB ± 0% 0.00kB -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_dup 32.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EncodeIntegers/10_seq 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/10_ran 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/10_dup 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/100_seq 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/100_ran 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/100_dup 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_seq 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_ran 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) EncodeIntegers/1000_dup 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)	2018-10-16 12:05:52 +01:00
Edd Robinson	6b52231a37	Batch oriented float encoders This commit adds a tsm1 function for encoding a batch of floats into a buffer. Further, it replaces the `bitstream` library used in the existing encoders (and all the current decoders) with inlined bit expressions within the encoder, significantly reducing the function call overhead for larger batches. The following benchmarks compare the performance of the existing iterator based encoders, and the new batch oriented encoders. They look at a sequential input slice and a randomly generated input slice. name old time/op new time/op delta EncodeFloats/10_seq 1.14µs ± 3% 0.24µs ± 3% -78.94% (p=0.000 n=10+10) EncodeFloats/10_ran 1.69µs ± 2% 0.21µs ± 3% -87.43% (p=0.000 n=10+10) EncodeFloats/100_seq 7.07µs ± 1% 1.72µs ± 1% -75.62% (p=0.000 n=7+9) EncodeFloats/100_ran 15.8µs ± 4% 1.8µs ± 1% -88.60% (p=0.000 n=10+9) EncodeFloats/1000_seq 50.2µs ± 3% 16.2µs ± 2% -67.66% (p=0.000 n=10+10) EncodeFloats/1000_ran 174µs ± 2% 16µs ± 2% -90.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta EncodeFloats/10_seq 0.00B 0.00B ~ (all equal) EncodeFloats/10_ran 0.00B 0.00B ~ (all equal) EncodeFloats/100_seq 0.00B 0.00B ~ (all equal) EncodeFloats/100_ran 0.00B 0.00B ~ (all equal) EncodeFloats/1000_seq 0.00B 0.00B ~ (all equal) EncodeFloats/1000_ran 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta EncodeFloats/10_seq 0.00 0.00 ~ (all equal) EncodeFloats/10_ran 0.00 0.00 ~ (all equal) EncodeFloats/100_seq 0.00 0.00 ~ (all equal) EncodeFloats/100_ran 0.00 0.00 ~ (all equal) EncodeFloats/1000_seq 0.00 0.00 ~ (all equal) EncodeFloats/1000_ran 0.00 0.00 ~ (all equal)	2018-10-16 12:05:52 +01:00
Edd Robinson	9ecadd1a9c	Rename time batch decoders	2018-10-16 12:05:52 +01:00
Edd Robinson	c1d82fccf0	Rename unsigned batch decoders	2018-10-16 12:05:52 +01:00
Edd Robinson	536e7bb62f	Rename string batch decoders	2018-10-16 12:05:52 +01:00
Edd Robinson	d819378ad3	Rename boolean batch decoders	2018-10-16 12:05:52 +01:00
Edd Robinson	f81e75f4c1	Rename integer batch decoders	2018-10-16 12:05:52 +01:00
Edd Robinson	12bb8881be	Rename float batch decoders	2018-10-16 12:05:52 +01:00
Jonathan A. Sternberg	af8bf99256	Do not panic when a series id iterator is nil	2018-10-11 15:16:59 -05:00
Jeff Wendling	69dc031a75	Use platform for most of the read service code This commit deletes most of the code to service reads from influxdb and pulls it in from platform instead. Of note, the models.Tag and models.Tags types are now aliases to the platform models.Tag and models.Tags types. Additionally, many types in the tsdb package relating to cursors are also aliases to the same types in the platform cursors package. This updates the platform and flux repos to the current master in the Gopkg.lock.	2018-10-10 11:20:25 -06:00
Ben Johnson	844b7ef9bf	Merge pull request #10299 from influxdata/bj-tsm1-panic-fix Fix TSM1 panic on reader error.	2018-10-10 08:12:17 -06:00
Edd Robinson	ee61ed3dca	Merge pull request #10327 from influxdata/er-duplicate-tsm Cleanup failed TSM snapshots	2018-10-09 17:08:04 +01:00
Ben Johnson	1580f90be4	Merge pull request #10339 from influxdata/bj-fix-series-index-tombstone Fix series file tombstoning.	2018-10-08 08:35:05 -06:00
Ben Johnson	2cb97146f0	Fix series file tombstoning. This commit fixes an issue with the series file compaction process where tombstones are lost after compaction and series existence checks are not correct. This commit also fixes some smaller flushing issues within the series file that mainly related to testing.	2018-10-05 08:23:25 -06:00
ludweeg	5622355526	Simplify s[:] to s where s is a slice	2018-10-04 17:10:21 +03:00
Edd Robinson	d649d5928b	Cleanup failed TSM snapshot If there was an error after the cache has been snapshotted to one or more TSM files, but before the cache and WAL are cleaned up, then the cache would be repeatedly snapshotted, generating duplicate level 1 TSM files. This commit attempts to clean those files up by removing the temporary TSM file(s). The snapshot will be retried.	2018-10-03 16:34:54 +01:00
Ben Johnson	bdcbad3fc9	Fix append of possible nil iterator. This commit updates an iterator list to ignore `nil` iterators. Adding a `nil` caused the `SeriesIterators.Close()` to panic.	2018-10-02 13:19:21 -06:00
Ben Johnson	0d777ad423	Fix tsi1 sketch locking.	2018-09-26 17:01:47 -06:00
Ben Johnson	da2dfa495e	Fix TSM1 panic on reader error. This commit fixes an error check so that a `nil` TSM reader does not cause a panic.	2018-09-24 08:54:28 -06:00
Edd Robinson	812ac6da25	PR feedback	2018-09-18 15:58:38 -07:00
Edd Robinson	a15bdeef92	Fix megacheck	2018-09-18 15:58:38 -07:00
Edd Robinson	76237d80f2	Address PR feedback	2018-09-18 15:58:38 -07:00
Ben Johnson	e651153f1c	Add TagValueSeriesIDCache.Delete().	2018-09-18 15:58:38 -07:00
Ben Johnson	fcbc03240a	Inline mutex into TagValueSeriesIDCache.	2018-09-18 15:58:38 -07:00
Ben Johnson	e4f8637234	Fix ParseSeriesKeyInto() buffer shrinkage.	2018-09-18 15:58:38 -07:00
Edd Robinson	bdc293abdd	Tidy up	2018-09-18 15:58:38 -07:00
Edd Robinson	cc6f8c3502	Reduce allocations in TSI TagSets implementation Since all tag sets are materialised to strings before this method returns, a large number of allocations can be avoided by carefully reusing buffers and containers. This commit reduces allocations by about 75%, which can be very significant for high cardinality workloads. The benchmark results shown below are for a benchmark that asks for all series keys matching `tag5=value0`. There are 100K matching series keys. benchmark old ns/op new ns/op delta BenchmarkIndexSet_TagSets/1M_series/inmem-8 10959963 11144345 +1.68% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 23632757 18768888 -20.58% BenchmarkIndexSet_TagSets/1M_series/inmem-8 10496303 10380551 -1.10% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 24344359 19020234 -21.87% BenchmarkIndexSet_TagSets/1M_series/inmem-8 10359864 10818296 +4.43% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 23453357 19027445 -18.87% BenchmarkIndexSet_TagSets/1M_series/inmem-8 10479519 10400619 -0.75% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 26364965 19023749 -27.84% BenchmarkIndexSet_TagSets/1M_series/inmem-8 10437794 10557066 +1.14% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 23126946 19196955 -16.99% benchmark old allocs new allocs delta BenchmarkIndexSet_TagSets/1M_series/inmem-8 51 51 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 80067 20071 -74.93% BenchmarkIndexSet_TagSets/1M_series/inmem-8 51 51 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 80067 20071 -74.93% BenchmarkIndexSet_TagSets/1M_series/inmem-8 51 51 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 80067 20071 -74.93% BenchmarkIndexSet_TagSets/1M_series/inmem-8 51 51 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 80067 20071 -74.93% BenchmarkIndexSet_TagSets/1M_series/inmem-8 51 51 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 80067 20071 -74.93% benchmark old bytes new bytes delta BenchmarkIndexSet_TagSets/1M_series/inmem-8 3556728 3556728 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 12677328 5157992 -59.31% BenchmarkIndexSet_TagSets/1M_series/inmem-8 3556728 3556728 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 12677328 5157992 -59.31% BenchmarkIndexSet_TagSets/1M_series/inmem-8 3556728 3556728 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 12677328 5157992 -59.31% BenchmarkIndexSet_TagSets/1M_series/inmem-8 3556728 3556728 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 12677328 5157992 -59.31% BenchmarkIndexSet_TagSets/1M_series/inmem-8 3556728 3556728 +0.00% BenchmarkIndexSet_TagSets/1M_series/tsi1-8 12677328 5157992 -59.31%	2018-09-18 15:58:38 -07:00
Edd Robinson	d8af622333	Add benchmark for TagSets across indexes	2018-09-18 15:58:38 -07:00
Edd Robinson	5c88a1dd0e	Fix locking on cache	2018-09-18 15:58:38 -07:00
Edd Robinson	6d12f5d323	Debug	2018-09-18 15:58:38 -07:00
Edd Robinson	8af7c133db	Refactor cache	2018-09-18 15:58:38 -07:00
Edd Robinson	1ae716b64e	Use copy-on-write when cloning bitmaps This commit sets the copy-on-write feature of the SeriesIDSets, such that we can make immutable clones of underlying bitmaps efficiently. If the original bitmap is modified then a copy will be made, which won't affect the clone.	2018-09-18 15:58:38 -07:00
Edd Robinson	baf35f2138	Add benchmarks for cache and option to disable	2018-09-18 15:58:38 -07:00
Edd Robinson	3f6ef0ba22	Update cached bitset results with new series ids This commit ensures that cached bitset results at the Index level are updated whenever new series ids are created that would belong in those bitsets. For example, if we have a cached bitset for the tuple {mem, region, west}, and we add the series mem,host=prod,region=west then we would update the cached bitset for {mem, region, west} with the series id of the newly written series.	2018-09-18 15:58:38 -07:00
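
The cache update described in `3f6ef0ba22` can be illustrated with a minimal sketch. The types and names here (`tagKey`, `cache`, `addSeries`) are hypothetical and a plain Go map stands in for the bitset; this is not the code from the commit, only an illustration of the idea that a newly created series id is added to every cached set whose {measurement, tag key, tag value} tuple it matches.

```go
package main

import "fmt"

// tagKey is a hypothetical cache key: {measurement name, tag key, tag value}.
type tagKey struct{ name, key, value string }

// cache maps a tuple to the set of series ids already computed for it.
// A map-based set stands in for the real bitset implementation.
type cache struct {
	sets map[tagKey]map[uint64]struct{}
}

// addSeries adds a new series id to every cached set that the series belongs to.
// Tuples that are not cached are left alone; they are computed on first lookup.
func (c *cache) addSeries(name string, tags map[string]string, id uint64) {
	for k, v := range tags {
		if set, ok := c.sets[tagKey{name, k, v}]; ok {
			set[id] = struct{}{}
		}
	}
}

func main() {
	c := &cache{sets: map[tagKey]map[uint64]struct{}{
		{"mem", "region", "west"}: {1: {}, 2: {}},
	}}
	// Writing mem,host=prod,region=west updates the cached {mem, region, west} set.
	c.addSeries("mem", map[string]string{"host": "prod", "region": "west"}, 3)
	fmt.Println(len(c.sets[tagKey{"mem", "region", "west"}]))
}
```

Only tuples that are already cached are touched, so the write path pays nothing for tuples nobody has queried.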
Edd Robinson	065d47e4f2	Return created series ids from LogFile insertion	2018-09-18 15:58:38 -07:00
Edd Robinson	52b5640a4a	Add test for TagValueSeriesIDIterator	2018-09-18 15:58:38 -07:00
Edd Robinson	81f640e9ae	Add more methods and benchmarks to SeriesIDSet	2018-09-18 15:58:38 -07:00
Edd Robinson	9fb301cf10	Add CreateSeriesListIfNotExists benchmark	2018-09-18 15:58:38 -07:00
Edd Robinson	2c4c79f110	Convert cache to LRU	2018-09-18 15:58:38 -07:00
Edd Robinson	2ae2157d02	debug	2018-09-18 15:58:38 -07:00
Edd Robinson	74b3d35e40	Basic cache	2018-09-18 15:58:38 -07:00
Edd Robinson	7d00a45ebf	Don't allocate when reading tombstone SeriesID set	2018-09-18 15:58:38 -07:00
Edd Robinson	722ca22c79	Switch to influxdata fork	2018-09-18 15:58:38 -07:00
Edd Robinson	ca07a38402	Add benchmark for TagValueSeriesIDIterator	2018-09-18 15:53:52 -07:00
linxGnu	1a236cf629	Update test case	2018-09-14 14:09:24 -07:00
linxGnu	1dde9a1e12	Update test case	2018-09-14 14:09:24 -07:00
linxGnu	6f10f54fd0	Update test case and documentation	2018-09-14 14:09:24 -07:00
linxGnu	2356a30833	Fix bug on array values in tsdb storage engine	2018-09-14 14:09:24 -07:00
Ben Johnson	f67bbe76b5	Merge pull request #10278 from influxdata/hll-memory Remove TSI1 HLL sketches from heap.	2018-09-12 14:46:25 -06:00
Ben Johnson	88d006a18c	Remove TSI1 HLL sketches from heap. This commit removes the HLL sketches on each `tsi1.LogFile` and `tsi1.IndexFile` and instead caches the data at the `tsi1.Index` level. This reduces the heap size significantly for servers with many TSI-enabled shards.	2018-09-12 08:48:40 -06:00
Stuart Carnie	a940ebd45a	fix(tsm1): Fix FloatBatchDecodeAll to return empty slice and no error FloatBatchDecodeAll behaves the same as the iterator-based float decoder, returning an empty slice and no error when passed a buffer with no encoded float values. Fixes #10270	2018-09-10 13:59:47 -07:00
Stuart Carnie	4e91c8d33d	Revert: Unmap LogFile on successful open Resolves a panic when attempting to sort the `logMeasurements` slice, which holds on to mmap'd data.	2018-09-04 15:16:09 -06:00
Hans Petter Bieker	de3a2d657d	Fixed indentation.	2018-08-31 11:01:45 +02:00
Hans Petter Bieker	28f5fb4ea5	Prevent rebuilding of series files for shards where nothing is deleted.	2018-08-31 10:51:38 +02:00
Jeff Wendling	4e62c3f795	fix(tsm1): return boolean array iterator for booleans booleans are still not strings.	2018-08-27 11:32:15 -06:00
Stuart Carnie	2f4fcd8255	chore: Remove BatchCursor references	2018-08-24 11:56:04 -07:00
Edd Robinson	74185d29e6	Merge pull request #10226 from influxdata/er-tsl-unmap Unmap LogFile on successful open	2018-08-24 12:12:14 +01:00
David Norton	05d979d6b1	Merge pull request #10215 from influxdata/dn-snappy-digests Switch digests to use snappy compression	2018-08-23 13:17:28 -04:00
David Norton	2f6a1fc03b	switch digests to use snappy compression	2018-08-23 13:02:12 -04:00
Edd Robinson	9970620ee0	Unmap LogFile on successful open Since we append to the file itself, once we have read the file in, we can be done with the mmap'd data. Ideally we can rework UnmarshalBinary and do away with the mmap completely. That is future work.	2018-08-23 17:24:22 +01:00
Stuart Carnie	e685556c81	fix(tsm1): Fix panic when calling Close twice on a descending cursor.	2018-08-22 13:49:59 -07:00
Jeff Wendling	6150bc1eea	Merge pull request #10217 from influxdata/jmw-cursor-iterator-fix fix(tsm1): return boolean iterator for booleans	2018-08-21 14:31:38 -06:00
Jeff Wendling	d258361fb4	fix(tsm1): return boolean iterator for booleans	2018-08-21 13:51:35 -06:00
Edd Robinson	f52de2d1e7	Ensure orphaned series removed from inmem index This commit ensures that any orphaned series (series that are to be removed and no longer are referenced anywhere in the database) are removed from the `inmem` index when a shard is dropped.	2018-08-21 15:00:35 +01:00
Edd Robinson	dece5b847f	Refactor index names	2018-08-21 14:32:30 +01:00
Edd Robinson	a67f15fad4	Promote DropSeriesGlobal to Index interface	2018-08-20 17:57:16 +01:00
Edd Robinson	035b26cadd	Refactor DropSeriesGlobal	2018-08-20 16:37:55 +01:00
Ben Johnson	2d266ca186	Merge pull request #9801 from influxdata/bj-validate-write Add option for unicode validation.	2018-08-20 03:44:41 -10:00
Edd Robinson	6b3860e9a1	Reduce allocations in TSI TagSets implementation Since all tag sets are materialised to strings before this method returns, a large number of allocations can be avoided by carefully reusing buffers and containers. This commit reduces allocations by about 75%, which can be very significant for high cardinality workloads. The benchmark results shown below are for a benchmark that asks for all series keys matching `tag5=value0`. name old time/op new time/op delta Index_ConcurrentWriteQuery/inmem/queries_100000-8 5.66s ± 4% 5.70s ± 5% ~ (p=0.739 n=10+10) Index_ConcurrentWriteQuery/tsi1/queries_100000-8 26.5s ± 8% 26.8s ±12% ~ (p=0.579 n=10+10) IndexSet_TagSets/1M_series/inmem-8 11.9ms ±18% 10.4ms ± 2% -12.81% (p=0.000 n=10+10) IndexSet_TagSets/1M_series/tsi1-8 23.4ms ± 5% 18.9ms ± 1% -19.07% (p=0.000 n=10+9) name old alloc/op new alloc/op delta Index_ConcurrentWriteQuery/inmem/queries_100000-8 2.50GB ± 0% 2.50GB ± 0% ~ (p=0.315 n=10+10) Index_ConcurrentWriteQuery/tsi1/queries_100000-8 32.6GB ± 0% 32.6GB ± 0% ~ (p=0.247 n=10+10) IndexSet_TagSets/1M_series/inmem-8 3.56MB ± 0% 3.56MB ± 0% ~ (all equal) IndexSet_TagSets/1M_series/tsi1-8 12.7MB ± 0% 5.2MB ± 0% -59.02% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Index_ConcurrentWriteQuery/inmem/queries_100000-8 24.0M ± 0% 24.0M ± 0% ~ (p=0.353 n=10+10) Index_ConcurrentWriteQuery/tsi1/queries_100000-8 96.6M ± 0% 96.7M ± 0% ~ (p=0.579 n=10+10) IndexSet_TagSets/1M_series/inmem-8 51.0 ± 0% 51.0 ± 0% ~ (all equal) IndexSet_TagSets/1M_series/tsi1-8 80.4k ± 0% 20.4k ± 0% -74.65% (p=0.000 n=10+10)	2018-08-10 16:01:49 +01:00
Edd Robinson	9f883c8dee	Add benchmark for TagSets across indexes	2018-08-10 16:01:49 +01:00
Stuart Carnie	12f7f45707	feat(tsdb): Add CursorType to enable selection of batch cursors	2018-08-10 06:39:14 -07:00
Jacob Marble	f1fc1b0264	Merge pull request #10175 from influxdata/jgm-copy-byte-slices tsdb: Copy return value of IndexSet.MeasurementNamesByExpr	2018-08-09 11:20:45 -07:00
Stuart Carnie	990824ceca	fix(tsdb): Fix panic, don't add nil iterator to slice fixes #10171	2018-08-09 10:12:49 -07:00
Jacob Marble	7bd9b2a627	tsdb: Copy return value of IndexSet.MeasurementNamesByExpr	2018-08-08 23:48:06 -07:00
Jonathan A. Sternberg	ebd77d1b3d	Merge pull request #10170 from influxdata/js-create-series-cursor-panic Prevent a panic from occurring when CreateSeriesCursor fails	2018-08-08 08:19:40 -05:00
Jonathan A. Sternberg	fe996612eb	Prevent a panic from occurring when CreateSeriesCursor fails The internals of `newSeriesCursor` returned a struct pointer that was implicitly converted to the interface. Unfortunately, Go treats the result of this conversion as an interface holding a nil pointer to the struct rather than as a plain nil, so if you compared the returned cursor to nil, they would not be equal, and the caller would treat the cursor as non-nil and attempt to use it.	2018-08-07 22:55:02 -05:00
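
The nil-interface pitfall described in `fe996612eb` can be reproduced in a few lines. This is a minimal sketch, not the code from the commit: `Cursor`, `seriesCursor`, `createBuggy`, and `createFixed` are hypothetical names, and only `newSeriesCursor` borrows its name from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// Cursor is a hypothetical interface standing in for the series cursor type.
type Cursor interface {
	Next() bool
}

// seriesCursor is a hypothetical concrete implementation.
type seriesCursor struct{}

func (c *seriesCursor) Next() bool { return false }

// newSeriesCursor returns a concrete pointer, which is nil on failure.
func newSeriesCursor(fail bool) (*seriesCursor, error) {
	if fail {
		return nil, errors.New("cannot create cursor")
	}
	return &seriesCursor{}, nil
}

// createBuggy converts the pointer to Cursor implicitly. On failure the
// interface carries the type *seriesCursor with a nil value, so it is != nil.
func createBuggy(fail bool) Cursor {
	cur, _ := newSeriesCursor(fail)
	return cur
}

// createFixed returns an untyped nil explicitly when construction fails.
func createFixed(fail bool) Cursor {
	cur, err := newSeriesCursor(fail)
	if err != nil {
		return nil
	}
	return cur
}

func main() {
	fmt.Println(createBuggy(true) == nil) // false: typed nil inside the interface
	fmt.Println(createFixed(true) == nil) // true: plain nil interface
}
```

An interface value is nil only when both its dynamic type and its value are nil, which is why the explicit `return nil` is needed.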
Jacob Marble	786d637780	tsdb: Cleanup compaction throughput code	2018-08-07 11:12:41 -07:00
Zach Goldstein	0ef3752a1a	Add configuration parameter to expose rate limit for TSM compaction. Closes: 9938	2018-08-07 10:05:36 -04:00
Edd Robinson	3bcb8ad9b2	Merge pull request #10161 from influxdata/er-tidy Simplify loops	2018-08-06 15:24:49 +01:00
Edd Robinson	9eece563b1	Simplify loops	2018-08-05 15:16:33 +01:00
David Norton	50bbf11299	add digest manifest	2018-08-03 15:17:08 -04:00
Edd Robinson	996bb9bfa6	Wire in mmap advise hint to TSMReader	2018-08-03 16:27:39 +01:00
Edd Robinson	282d265dd4	Add mmap madvise kernel value config option This PR adds a configuration option that can be used to inform the kernel that we intend to page in much of the TSM files. This madvise value has been problematic in the past when it's been set, so this option defaults to off. It may be useful to some users with slow disks.	2018-08-03 14:07:46 +01:00
Edd Robinson	19a4f1c9b0	Fix megacheck	2018-07-31 15:22:54 +01:00
Edd Robinson	7662249fb9	Revert to RoaringBitmap org	2018-07-31 15:17:03 +01:00
Edd Robinson	61af08abde	Fix megacheck	2018-07-31 15:03:54 +01:00
Ben Johnson	5612511a8f	Use roaring.Bitmap.FromBuffer(), remove memory alignment.	2018-07-30 13:42:13 +00:00
Ben Johnson	66920a181a	Add legacy tsi1 uvarint encoding test.	2018-07-27 15:43:14 +01:00
Ben Johnson	80d01325f8	Refactor file set tag value iterators to support series sets & tombstones.	2018-07-26 23:48:27 +01:00
Ben Johnson	cb828f0187	Fix roaring dependency, minor PR fixes.	2018-07-26 09:32:43 +01:00
Ben Johnson	fdfd038401	Add roaring bitmaps to TSI index files.	2018-07-24 17:59:23 +01:00
Jeff Wendling	63fbf53699	Merge pull request #10063 from influxdata/jmw-extra-log-context Make store include context in logs	2018-07-18 11:53:22 -06:00
Edd Robinson	95db829631	Remove default max concurrent compaction limit PR #9204 introduced a maximum default concurrent compaction limit of 4. The idea was to reduce IO utilisation on large systems with many cores, and high write load. Often on these systems, disks were not scaled appropriately to the write volume, and while the write path could keep up, compactions would saturate disks. In #9225 work was done to reduce IO saturation by limiting the compaction throughput. To some extent, both #9204 and #9225 work towards solving the same problem. We have recently begun to notice larger clusters suffering from situations where compactions are not keeping up because they have been scaled up, but the limit of 4 has stayed in place. While users can manually override the setting, it seems more user friendly if we remove the limit by default, and set it manually in cases where compactions are causing too much IO on large boxes.	2018-07-18 17:27:49 +01:00
Edd Robinson	55ffeb563a	Tidy up logging of compaction settings	2018-07-18 17:26:34 +01:00
Jeff Wendling	7bdbe26534	Make store include context in logs If some error or message is in the context of some shard or database be sure to include it in the message.	2018-07-18 10:22:53 -06:00
Edd Robinson	80dc07cbcb	Efficient means of getting fields for measurement If it's known that the read request only needs to use a single measurement, then we can avoid the need to get field keys via the query engine. However, that means that a new method of getting the field keys for a measurement would be needed. This commit exposes a method to efficiently get field key names for a measurement across multiple shards.	2018-07-18 12:21:54 +01:00
Edd Robinson	9c5c1c7001	Optimisation for expressions with single measurement	2018-07-18 12:21:54 +01:00
Jeff Wendling	f5ed934646	Merge pull request #10089 from influxdata/jmw-radix-sort inmem: use radix sort for series ids	2018-07-17 17:45:41 -06:00
Jeff Wendling	d979518135	inmem: use radix sort for series ids	2018-07-17 12:31:12 -06:00
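
For readers unfamiliar with the technique named in `d979518135`, the following is a minimal sketch of a least-significant-digit radix sort over `uint64` ids. The function name `radixSortIDs` and the choice of one byte per pass are assumptions made for illustration; the commit's actual implementation may differ.

```go
package main

import "fmt"

// radixSortIDs sorts ids in place (using one scratch slice of equal length)
// with an LSD radix sort: eight stable counting-sort passes, one per byte.
func radixSortIDs(ids []uint64) {
	if len(ids) < 2 {
		return
	}
	src, dst := ids, make([]uint64, len(ids))
	for shift := uint(0); shift < 64; shift += 8 {
		// Count how many ids fall into each of the 256 buckets for this byte.
		var counts [256]int
		for _, v := range src {
			counts[(v>>shift)&0xff]++
		}
		// Turn counts into starting offsets (exclusive prefix sums).
		sum := 0
		for i, c := range counts {
			counts[i] = sum
			sum += c
		}
		// Scatter ids into dst, preserving the order within each bucket.
		for _, v := range src {
			b := (v >> shift) & 0xff
			dst[counts[b]] = v
			counts[b]++
		}
		src, dst = dst, src
	}
	// Eight passes mean an even number of swaps, so src is ids again here
	// and the sorted result already lives in the caller's slice.
}

func main() {
	ids := []uint64{42, 7, 1 << 40, 7, 0, 300}
	radixSortIDs(ids)
	fmt.Println(ids)
}
```

Unlike a comparison sort, the work is linear in the number of ids for a fixed key width, at the cost of one scratch buffer.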
Ben Johnson	b05e83e8ef	Merge pull request #10021 from huhue/sfile_fix bug fix from seriesfile enable EnableCompactions function	2018-07-17 15:42:00 +01:00
David Norton	6016a80997	allow tag keys to contain underscores	2018-07-17 09:39:08 -04:00
Stuart Carnie	d977c0ac24	fix(tsdb): Fix existing Prometheus tests based on batch cursors	2018-07-16 08:55:37 -07:00
Stuart Carnie	497fc42779	pr(tsdb): Feedback items from megacheck * batch cursors and cursorIterator will be removed in a follow up PR using Arrow array data structures	2018-07-16 08:55:37 -07:00
Stuart Carnie	910d0fe5e6	feat(tsm1): ArrayCursor interfaces and implementations Array cursors are enabled for storage RPC calls tsm1: * Implemented cursors that utilize Array decoders storage: * Abstractions to easily switch to Array cursors	2018-07-16 08:55:37 -07:00
Stuart Carnie	3632df77a6	feat(tsm1): Add Read<type>ArrayBlock APIs to FileStore * introduced tmpl from Arrow, which allows existing templates to be reused with additional command-line properties to control output. * duplicated suite of ReadFloatBlock tests for ReadFloatArrayBlock * only the float data type is tested as the Read APIs are generated from a single template.	2018-07-16 08:55:37 -07:00
Stuart Carnie	790639d728	feat(tsm1): Add Read<Type>ArrayBlock APIs to TSMReader and mmapAccessor	2018-07-16 08:55:37 -07:00
Stuart Carnie	0841c51d93	pr(tsdb): Feedback items from PR review	2018-07-13 11:42:02 -07:00
Stuart Carnie	9cd31520ec	feat(tsm1): Implement APIs to decode TSM data into array data structures * These APIs will be used by `TSMReader` and `KeyCursor` types via new APIs, using similar naming convention (Array)	2018-07-13 11:42:02 -07:00
Stuart Carnie	9c29cd69e5	feat(tsm1): Provide columnar value types * separate slices for time and values * structured to be Arrow ready * batch decoders fill time and value slices independently, which vastly improves performance (benchmarks linked in PR)	2018-07-13 11:42:02 -07:00
Stuart Carnie	b3e53ae2dc	feat(tsm1): New APIs to decode an entire buffer of data * APIs decode an entire byte slice of encoded data into the provided `dst` slice * APIs are stateless and in almost all cases avoid any allocations * Intended to be used by future batch-oriented TSM block decode APIs * duplicated tests from original iterator-based APIs	2018-07-13 11:42:02 -07:00
Stuart Carnie	06257822c2	fix(tsm1): Reset vals to ensure Include is correctly tested	2018-07-13 11:42:02 -07:00
Stuart Carnie	7948a8e217	chore(tsm1): Add benchmarks for existing typed decoders These benchmarks will be implemented in batched decoders to compare performance.	2018-07-13 11:42:02 -07:00
Jacob Marble	ffe54d0239	Revert "Resolve deadlock" This reverts commit `681f22b078`.	2018-07-09 22:05:54 -07:00
Edd Robinson	ad388a8fd8	Address PR feedback	2018-07-09 11:51:48 +01:00
Edd Robinson	11bea138f8	Restrict buffer size	2018-07-09 11:51:48 +01:00
Edd Robinson	96ed566e6c	Store series ID sets in LogFile as bitmaps This commit swaps out map[uint64]struct{} implementations for roaring bitmaps, which in turn improves memory usage and read performance. The bitmap implementation is abstracted such that for low cardinality sets a simple slice of ids is used, to reduce in-use memory.	2018-07-09 11:51:48 +01:00
Edd Robinson	13f896b9ff	Buffer writing of .tsl file with 128K buffer	2018-07-09 11:51:48 +01:00
Edd Robinson	3cf20823e9	Allow LogFile buffer size to be changed When adding many series using offline tooling, it's likely that every series involves an entry being appended to a LogFile. Typically an entry is 11 or 12 bytes, but the default bufio.Writer buffer size is only 4K. This means by default a write of 10,000 new series would involve ~30 buffer flushes. This commit makes the buffer configurable, and sets the value in `buildtsi` such that it reflects the number of series being written to the LogFile.	2018-07-09 11:51:48 +01:00
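
The flush arithmetic in `3cf20823e9` (10,000 entries of about 12 bytes is roughly 120,000 bytes, or about 30 fills of a 4K buffer) can be checked with a small sketch. `countingWriter` and `countWrites` are hypothetical helpers written for this illustration; only `bufio.NewWriterSize` is the standard library call, and the 128K size is borrowed from the neighbouring commit `13f896b9ff`.

```go
package main

import (
	"bufio"
	"fmt"
)

// countingWriter is a hypothetical sink that counts how many times the
// buffered writer hands data to the underlying writer.
type countingWriter struct{ writes int }

func (w *countingWriter) Write(p []byte) (int, error) {
	w.writes++
	return len(p), nil
}

// countWrites appends `entries` records of `entrySize` bytes through a
// bufio.Writer with the given buffer size and reports the underlying writes,
// including the final explicit Flush.
func countWrites(entries, entrySize, bufSize int) (int, error) {
	cw := &countingWriter{}
	bw := bufio.NewWriterSize(cw, bufSize)
	entry := make([]byte, entrySize)
	for i := 0; i < entries; i++ {
		if _, err := bw.Write(entry); err != nil {
			return 0, err
		}
	}
	if err := bw.Flush(); err != nil {
		return 0, err
	}
	return cw.writes, nil
}

func main() {
	small, _ := countWrites(10000, 12, 4096)    // default-sized 4K buffer
	large, _ := countWrites(10000, 12, 128<<10) // 128K buffer
	fmt.Println("4K buffer writes:", small)
	fmt.Println("128K buffer writes:", large)
}
```

With a buffer larger than the total payload, everything reaches the file in the single final flush, which is the motivation for sizing the buffer to the number of series being written.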
Edd Robinson	681af04815	Optionally disable buffer flushing/file syncing When running offline tooling, flushing buffers and syncing files on every write to a `LogFile` is not necessary. Were a hard exit with data loss to occur, the tooling can simply be run again.	2018-07-09 11:51:15 +01:00
Jacob Marble	b7d5e2ecdf	Merge pull request #10050 from influxdata/jgm-delete-regex Resolve deadlock deleting from many measurements concurrently	2018-07-08 17:01:33 -07:00
Jacob Marble	681f22b078	Resolve deadlock TSI LogFile compactions occasionally race with insert and delete operations because the index partition FileSet is retained needlessly by the method that calls Partition.CheckLogFile. In this change: - TSI LogFile compaction respects enable/disable compactions - Partition FileSet.Release before log compaction is triggered An alternative to the second step is to handle log file compaction in a new goroutine. Log file compaction errors would be logged and not returned to the caller. After this change, `DELETE FROM /regex/` does not deadlock; performance: - 30s to delete 100 measurements - 5m30s to delete 1000 measurements	2018-07-06 15:02:38 -07:00
Jacob Marble	2ac811e57e	close objects without swallowing errors	2018-07-06 13:45:22 -07:00
Jacob Marble	0af22b5992	Partition receiver rename Got tired of referring to Index Partitions as `i` instead of `p`.	2018-07-05 14:28:00 -07:00
Jacob Marble	dcb85d2e92	Init TSI partition logger TSI Partition logging was never initialized because WithLogger was called after Open; Open initializes Partition loggers.	2018-07-05 14:27:09 -07:00
Ben Johnson	979d790154	Implement bitset iterator	2018-07-05 09:01:22 -06:00
Edd Robinson	6059db3d3a	Filter series IDs at the last possible moment	2018-07-02 16:48:40 +01:00
Edd Robinson	609b980671	Don't filter at low-level	2018-07-02 16:47:44 +01:00

2748 Commits (b17f27a5d987585b2277d435797247e45c2898a0)