influxdb

Commit Graph

Author	SHA1	Message	Date
Jonathan A. Sternberg	32e42b93ae	Merge pull request #6705 from influxdata/js-6701-duplicate-points-with-select Filter out sources that do not match the shard database/retention policy	2016-05-24 09:48:31 -04:00
Jonathan A. Sternberg	5e7e0bd19b	Filter out sources that do not match the shard database/retention policy If you use a statement like `SELECT value FROM one..cpu, two..cpu`, it will access both the `one` and `two` databases as if you had selected the `cpu` measurement twice for both of them. Updated the `tsdb.Shard` create iterator function to filter out any sources that do not apply to that shard so this duplication doesn't happen. Fixes #6701.	2016-05-23 17:05:33 -04:00
Jason Wilder	f48a106860	Optimized timestamp run-length decoding Removes the up-front allocation of decoded values and returns them as needed.	2016-05-23 14:05:25 -06:00
Edd Robinson	0b2a806789	Merge pull request #6690 from influxdata/jw-shard-size Fix panic in shard.DiskSize()	2016-05-20 15:29:53 +01:00
Edd Robinson	40732a35d0	Merge pull request #6660 from influxdata/er-vet Fix vet issues	2016-05-20 11:12:25 +01:00
Jason Wilder	d324777bfc	Fix panic in shard.DiskSize() If the wal or data dir is not accessible (possibly deleted), the DiskSize walk funcs could panic because they did not check the error passed in.	2016-05-19 23:19:44 -06:00
Jonathan A. Sternberg	5621ccc2ce	Remove limit optimization when using an aggregate The limit optimization was put into the wrong place and caused only part of the shard to be read when a limit was used. The optimization is possible, but requires a bit of refactoring to the code here so the call iterator is created per series before being handed to the limit iterator. Fixes #6661.	2016-05-19 10:29:38 -04:00
Jason Wilder	4c089a56f4	Fix read tombstones: EOF Due to a bug in TSM tombstone files, it was possible to create empty tombstone files. At startup, the TSM file would error out and not load the TSM file. Instead, treat it as an empty v1 file so the TSM file can load correctly. Fixes #6641	2016-05-18 23:29:25 -06:00
Jason Wilder	7fb7faaaca	Fix points already read from being returned more than once If there were duplicate points in multiple blocks, we would correctly dedup the points and mark the regions of the blocks we've read. Unfortunately, we were not excluding the already read points as the cursor moved to points in the later blocks, which could cause points to be returned twice incorrectly. Fixes #6611	2016-05-18 17:21:10 -06:00
Jason Wilder	9f89420b4c	Merge pull request #6653 from influxdata/jw-compact-fix Compaction fixes	2016-05-18 16:10:10 -06:00
Jason Wilder	121195a865	Merge pull request #6665 from influxdata/jw-series-stats Reload series count stat at startup	2016-05-18 15:58:15 -06:00
Edd Robinson	09dc48b847	Merge pull request #6664 from influxdata/jw-shard-size Store shard size on disk statistic	2016-05-18 22:39:12 +01:00
Jason Wilder	209dd005c5	Merge pull request #6627 from influxdata/jw-deadlock Fix possible deadlock when queries and delete series run concurrently	2016-05-18 15:30:37 -06:00
Jason Wilder	f2bcf9d9ab	Code review fixes	2016-05-18 15:25:56 -06:00
Jason Wilder	d32ad26d27	Fix data not getting reloaded The optimization to speed up shard loading had the side effect of skipping adding series to the index that already exist. The skipping was in the wrong location and also skipped the shard's measurementFields index, which is required in order to query that series in the shard.	2016-05-18 15:25:56 -06:00
Jason Wilder	e859141b75	Speed up tests Switched the max keys test to write int64 of the same value so RLE would kick in and the file size will be smaller (84MB vs 3.8MB). Removed the chunking test which was skipped because the code will not downsize a block into smaller chunks now. Skip MaxKeys tests in various environments because it needs to write too much data to run reliably.	2016-05-18 15:25:56 -06:00
Jason Wilder	eff71cbe23	Rollover to new TSM file when max blocks exceeded Fixes #6406	2016-05-18 15:25:55 -06:00
Jason Wilder	8fda621d8b	Fix memory spike when compacting overwritten points If a large series contains a point that is overwritten, the compactor would load the whole series into RAM during a full compaction. If the series was large, it could cause very large RAM spikes and OOMs. The change reworks the compactor to merge blocks more incrementally similar to the fix done in #6556. Fixes #6557	2016-05-18 15:25:55 -06:00
Jason Wilder	f1ab89561a	Reload series count stat at startup	2016-05-18 15:21:57 -06:00
Edd Robinson	28ad7c687b	Add const for interval	2016-05-18 22:14:59 +01:00
Jason Wilder	cbc551f9dc	Collect shard size stats	2016-05-18 22:14:59 +01:00
Jonathan A. Sternberg	946968ba23	Fixing panic in SHOW FIELD KEYS caused by `733a17d` The list of field keys in the index may have differed from the field keys in the actual shard. Fixing `SHOW FIELD KEYS` so it relies only on the shard rather than the index. Fixes #6659.	2016-05-18 14:43:50 -04:00
Edd Robinson	f78e67d09c	Fix concurrent map access panic	2016-05-18 17:56:50 +01:00
Edd Robinson	f680ab0f0d	Fix vet issues	2016-05-18 13:34:11 +01:00
Joe LeGasse	af432e7d12	Fix loop variable reuse in database close Fixes #6650	2016-05-17 11:25:39 -04:00
Jonathan A. Sternberg	42cdaf0365	Merge pull request #6529 from influxdata/js-6519-select-tag-key-specifier Support cast syntax for selecting a specific type	2016-05-16 12:30:14 -04:00
Jonathan A. Sternberg	23f6a706bb	Support cast syntax for selecting a specific type Casting syntax is done with the PostgreSQL syntax `field1::float` to specify which type should be used when selecting a field. You can also do `field1::field` or `tag1::tag` to specify that a field or tag should be selected. This makes it possible to select a tag when a field key and a tag key conflict with each other in a measurement. It also means it's possible to choose a field with a specific type if multiple shards disagree. If no types are given, the same ordering for how a type is chosen is used to determine which type to return. The FieldDimensions method has been updated to return the data type for the fields that get returned. The SeriesKeys function has also been removed since it is no longer needed. SeriesKeys was originally used for the fill iterator, but then expanded to be used by auxiliary iterators for determining the channel iterator types. The fill iterator doesn't need it anymore and the auxiliary types are better served by FieldDimensions implementing that functionality, so SeriesKeys is no longer needed. Fixes #6519.	2016-05-16 12:08:29 -04:00
Jason Wilder	ce141eae37	Merge pull request #6637 from influxdata/jw-revert-compact Revert "Fix memory spike when compacting overwritten points"	2016-05-16 09:46:24 -06:00
Jason Wilder	23fc9ff748	Revert "Fix memory spike when compacting overwritten points" This reverts commit `d99c5e26f6`.	2016-05-16 09:30:34 -06:00
Jonathan A. Sternberg	a17f3d960a	SHOW TAG VALUES accepts != and !~ in WHERE clause Fixes #6607.	2016-05-16 08:51:09 -04:00
Jason Wilder	57d4becaec	Fix possible deadlock when queries and delete series run concurrently This lock showed up in a deadlock on systems running queries and delete series across a large dataset. Queries should not need to lock the tsdb.Store for writes.	2016-05-13 17:04:12 -06:00
Jason Wilder	5b6f3afefa	Limit concurrent shards loading to number of cores available	2016-05-13 15:41:32 -06:00
Jason Wilder	11871958c6	Merge pull request #6618 from influxdata/jw-shard-load Optimize shard index loading	2016-05-13 14:16:17 -06:00
Jason Wilder	9e54adc719	Speed up drop database Drop database was closing and deleting each shard dir individually and serially. It would then delete the empty database dirs. This changes drop database to close all shards in parallel and run one os.RemoveAll to remove everything under the db dir which is more efficient. This also reworked the locking to avoid locking the tsdb.Store for long periods of time. That can cause queries and writes for other databases to block as well.	2016-05-13 10:26:28 -06:00
Jason Wilder	0dbd4893da	Optimize shard index loading On data sets with many series and potentially large series keys, the cost of parsing the key and re-indexing can be high. Loading the TSM keys into the index was being done repeatedly for series that were already indexed by an earlier TSM file. This was wasted work and slowed down shard loading. Parsing the key was also inefficient and allocated a new string slice. This was simplified to remove that allocation.	2016-05-12 14:02:42 -06:00
Ben Johnson	7afb73aa99	Merge pull request #6598 from benbjohnson/parallelize-planning Parallelize query planning	2016-05-12 09:00:58 -06:00
Jonathan A. Sternberg	89346bb618	Merge pull request #6600 from influxdata/0.13 Merge 0.13 release candidate back to master	2016-05-11 13:04:26 -04:00
Ben Johnson	668bae57df	parallelize query planning This commit changes the `tsm1.Engine` to create individual series iterators in batches so that it can be parallelized. Iterators are combined at the end so they can be redistributed to the parallelized merge iterator.	2016-05-11 10:38:11 -06:00
Cory LaNou	c32906a366	Merge pull request #6593 from influxdata/cjl-copyshard create shard snapshot	2016-05-10 20:01:59 -05:00
Jonathan A. Sternberg	8353b0c20f	Merge pull request #6592 from influxdata/js-3451-show-field-keys-with-field-type Update SHOW FIELD KEYS to return the field type with the field key	2016-05-10 14:13:17 -04:00
Jason Wilder	d8490f1170	Merge pull request #6587 from influxdata/jw-validate-fields Fix for merge values	2016-05-10 11:56:07 -06:00
Jonathan A. Sternberg	733a17d9e9	Update SHOW FIELD KEYS to return the field type with the field key Fixes #3451.	2016-05-10 13:16:57 -04:00
Cory LaNou	f415cf89ad	wip	2016-05-10 11:01:03 -05:00
Jason Wilder	9b86bfea2a	Merge pull request #6582 from eleme/fix_engine_cache_size fix cache size of engine	2016-05-10 09:01:03 -06:00
Jason Wilder	8839cabd41	Add benchmark for Merge	2016-05-10 08:39:55 -06:00
Cory LaNou	4d30ea1eb3	minor PR feedback refactor	2016-05-10 08:14:51 -05:00
Cory LaNou	a3bf3e2ef1	added baseline backup/restore plumbing	2016-05-10 08:14:51 -05:00
Jason Wilder	4f39cb2f97	Fix case where Merge returns unsorted values	2016-05-09 15:40:34 -06:00
Ben Johnson	078e561820	parallelize iterators	2016-05-09 10:25:30 -06:00
Jonathan A. Sternberg	3f4072be7a	Fix SHOW TAG VALUES condition to not filter "name" erroneously Before #6038 was merged, we needed to filter "name" so that it didn't accidentally hit the code path that used "name" to check the name of a measurement. This was changed to "_name" to avoid a conflict with a legitimate tag that used "name" as the key. SHOW TAG VALUES was never modified to remove the code that filtered out "name". This removes that line of code so a condition with "name" doesn't get removed erroneously. Example: `SHOW TAG VALUES WITH KEY = host WHERE "name" = 'jsternberg'`. Fixes #6581.	2016-05-09 10:27:53 -04:00

1126 Commits (2cbddb3efd851d408b9eb0676e168b27dbe7eb79)