influxdb

Commit Graph

Author	SHA1	Message	Date
Jeff Wendling	8ad515b387	tsdb: remove the shard id again callers can always ensure that the observer set on the engine options is appropriate for that shard id. this simplifies the api and reduces the chance of bugs due to mixing up shard ids.	2018-05-23 13:04:54 -06:00
Jeff Wendling	e62b1a02fb	Merge pull request #9879 from influxdata/jmw-add-shard-number-to-observer tsdb: add shard number to the observer	2018-05-21 20:00:47 -06:00
michaelyou	efc324681a	Typo	2018-05-20 22:37:22 +08:00
Jeff Wendling	eb4bf651e5	tsdb: add shard number to the observer an observer may want to know what shard the file is part of. this way, they don't have to rely on brittle file path parsing.	2018-05-18 18:15:44 -06:00
Jeff Wendling	6320316fd4	Merge pull request #9852 from influxdata/jmw-tsm-notifications file store: send notifications about new/deleted tsm files.	2018-05-18 11:29:34 -06:00
Jacob Marble	3f2ff742c0	Remove unused 'database' field	2018-05-18 09:22:43 -07:00
Jeff Wendling	27040d6f31	file store: send notifications about new/deleted tsm files. just adds some interface for hooks about when these files come and go. we do them before the action is taken so that if the hook has an error, it doesn't have any consistency problems.	2018-05-17 12:19:58 -06:00
Jeff Wendling	3fc40dd4a0	Merge pull request #9824 from influxdata/jmw-optimize-radix radix: optimize for our use case	2018-05-16 13:43:30 -06:00
Ben Johnson	8838d284a5	Merge pull request #9826 from influxdata/bj-tsm-filename TSM Filename Injection	2018-05-15 15:50:26 -06:00
Jacob Marble	7f8b7af61e	Cleanup index memory footprint counting code (#9828) * Fix IndexSet.DedupeInmemIndexes * Cleanup index memory footprint code	2018-05-15 11:25:19 -07:00
Ben Johnson	35a64dee99	Inject tsm file naming.	2018-05-14 10:46:38 -06:00
Jeff Wendling	cb9c3ee509	radix: optimize for our use case - reduce allocations by making leaf a value type with a bool - make longestPrefix inlineable and have no bounds checks - delete any code for functions we don't plan to use - operate on []byte and only copy when necessary - inline calls to sort.Search to avoid allocations and indirections - insert directly in the correct location for addEdge - reduce allocations during copying with a buffer helper results: name old time/op new time/op delta Tree_Insert-8 1.10ms ± 4% 0.73ms ± 4% -33.54% (p=0.000 n=10+10) Tree_InsertNew-8 3.18ms ± 2% 1.91ms ± 6% -39.90% (p=0.000 n=10+10) name old speed new speed delta Tree_Insert-8 9.12MB/s ± 4% 13.72MB/s ± 4% +50.46% (p=0.000 n=10+10) Tree_InsertNew-8 3.15MB/s ± 2% 5.24MB/s ± 6% +66.42% (p=0.000 n=10+10) name old alloc/op new alloc/op delta Tree_InsertNew-8 1.62MB ± 0% 1.60MB ± 0% -1.28% (p=0.000 n=10+9) name old allocs/op new allocs/op delta Tree_InsertNew-8 35.0k ± 0% 15.0k ± 0% -57.04% (p=0.000 n=10+10) MB/sec in this case is 1 byte per key inserted, so it's really millions of keys inserted per second.	2018-05-11 11:56:11 -06:00
Jacob Marble	0763d1789e	Get inmem index bytes without double-counting	2018-05-10 11:33:52 -07:00
Jason Wilder	de58584ce7	Merge pull request #9748 from influxdata/jw-series-type Prevent series type conflict	2018-05-10 07:05:45 -06:00
Jacob Marble	148341fb2a	tsdb/WAL: Better respect for WAL disabled	2018-05-08 15:04:33 -07:00
Jason Wilder	aea9bf3464	Hide series type map behind feature flag The performance is not good enough to enable by default so this allows the functionality to be merged while performance is improved.	2018-05-02 06:50:35 -06:00
Jason Wilder	2be2418b89	Add series type validation to Engine This is the start of per-series validation that occurs in the Engine write path. It uses an in-memory radix tree to reduce memory usage and is re-built on demand the first time a series is written.	2018-04-30 17:26:23 -06:00
Stuart Carnie	e0ae9c5a2d	tsm1: Replace goroutine `merge` with k-way merge Previously replaced WalkKeys implementation for a considerable improvement to startup time	2018-04-30 07:57:55 -07:00
Jason Wilder	97ecf62ffb	Return time range from delete predicate func This moves the time range to delete to be returned by the predicate func in DeleteSeriesRangeWithPredicate. It allows for a single delete to delete different ranges of times per series instead of a single range of time for all series.	2018-04-09 20:01:33 -06:00
Adam	72bceca888	Fix stream package to allow for renaming the file before writing it to the stream (#9684) * Fix stream package to allow for renaming the file before writing it to the stream * updated test to make sure that the final tsm file has more than one block	2018-04-05 16:24:29 -04:00
Ben Johnson	db9d32e514	Ignore index size in Engine.DiskSize(). TSM includes index in DiskSize(), however, indexes are not copied and shouldn't be included in this method. This causes issues with `copy-shard`.	2018-03-29 13:03:48 -06:00
Jacob Marble	470ee7f176	Add ability to delete many series with predicate	2018-03-28 08:32:18 -07:00
Jason Wilder	477de23e35	Merge pull request #9609 from influxdata/jw-compaction-filter Add capability change compaction planner	2018-03-22 07:30:52 -06:00
Jason Wilder	0eb6564e79	Add extension point to swap out the compaction planner	2018-03-21 15:51:00 -06:00
Stuart Carnie	aa61359cc7	Storage RPC API improvements. See PR for details * reduce # allocations (115M -> 22M) * reduce size allocations (53GB -> 1.3GB) * reduce RPC query time (45s -> 12.9s)	2018-03-21 13:46:09 -07:00
Edd Robinson	7c3ae91d1e	Merge pull request #9551 from influxdata/er-fieldset-panic Fix panic when checking fieldsets	2018-03-12 17:28:58 +00:00
Jason Wilder	444ad747b6	Add option to disable WAL This adds an internal option (not exposed via config) to disable the WAL when using the TSM engine directly.	2018-03-12 09:48:11 -06:00
Edd Robinson	c1e1412dae	Don't panic when checking for field	2018-03-12 15:25:20 +00:00
Stuart Carnie	e493a3e1db	use child logger	2018-02-27 20:27:24 -07:00
Stuart Carnie	48fb2a4cc5	Merge pull request #9487 from influxdata/sgc-tagsets fallback to inmem TagSets implementation	2018-02-27 09:06:54 -07:00
Stuart Carnie	b72e0c5941	fallback to inmem TagSets implementation	2018-02-27 07:49:51 -07:00
Edd Robinson	96c0ecf618	Improve startup time of `inmem` index This commit improves the startup time when using the `inmem` index by ensuring that the series are created in the index and series file in batches of 10000, rather than individually. Fixes #9486.	2018-02-27 13:33:00 +00:00
Stuart Carnie	b03cf6a953	prefix with `tsm1_` for consistency	2018-02-26 13:00:03 -07:00
Stuart Carnie	a74d296200	use underscore vs period, fix doc comment, add database name to CQ	2018-02-26 10:08:43 -07:00
Stuart Carnie	d40d3ecc2e	Merge pull request #9456 from influxdata/sgc-logging Generate trace logs for a number of important InfluxDB operations	2018-02-21 15:09:18 -07:00
Stuart Carnie	d135aecf02	Generate trace logs for a number of significant influx operations * tsdb Store.Open traces all events related to opening files * op.name : tsdb.open * retention policy shard deletions * op.name : retention.delete_check * all TSM compaction strategies * op.name : tsm1.compact_group * series file compactions * op.name : series_partition.compaction * continuous query execution (if logging enabled) * op.name : continuous_querier.execute * TSI log file compaction * op_name: index.tsi.compact_log_file * TSI level compaction * op.name: index.tsi.compact_to_level	2018-02-21 15:08:49 -07:00
Jason Wilder	fd90ec2b04	Remove noisy trace logging in TSM engine This logging is noisy and allocates a lot of garbage. There are stats now that have the same information.	2018-02-21 12:51:01 -07:00
Jonathan A. Sternberg	2bbd96768d	Update logging calls to take advantage of structured logging Includes a style guide that details the basics of how to log.	2018-02-20 10:04:19 -06:00
Stuart Carnie	6e47ff8d7f	simplify code	2018-02-14 06:55:48 -07:00
Edd Robinson	544329380f	Add empty series sketches back to tsi1 index This commit adds initial empty sketches back to the tsi1 index, as well as ensuring that ephemeral sketches in the index `LogFile` are updated accordingly. The commit also adds a test that verifies that the merged sketches at the store level produce the correct results under writes, deletions and re-opening of the store. This commit does not provide working sketches for post-compaction on the tsi1 index.	2018-02-07 14:52:13 -07:00
Joe LeGasse	21a58235fc	Merge branch 'master' into jl-race	2018-01-29 15:52:18 -05:00
Edd Robinson	821b784fa0	Switch deprecated HasPrefix for raw string check	2018-01-21 12:08:25 -08:00
Edd Robinson	42c3adeffc	simplify packages under tsdb	2018-01-21 09:41:27 -08:00
Edd Robinson	90903fa6ed	Remove unused code/cleanup engine package	2018-01-20 13:56:45 +00:00
Jason Wilder	97f61e0ff4	Allow SeriesFile compaction to be disabled	2018-01-18 15:54:52 -07:00
Jason Wilder	d755daede8	Add ability to enable/disable tsi compactions	2018-01-18 14:25:58 -07:00
Joe LeGasse	425a5e5f17	tsm1: prevent WaitGroup race	2018-01-17 13:08:11 -05:00
Jason Wilder	b05754fd23	Fix nil pointer panic Under concurrent writes and deletes of the same series, a nil panic could occur in bytes.Compare. Instead of setting the seriesKeys to nil, set them to a 0 length slice which prevents the panic.	2018-01-17 07:57:30 -07:00
Jason Wilder	5d6b8fc834	Drop measurement after series This separates out the dropping of a measurement from the series to avoid frequent checks to see if a measurement still has series. The series are dropped individually and we keep track of which measurements are involved and then delete each measurement afterwards.	2018-01-17 07:57:25 -07:00
Jason Wilder	1c8676b4a3	Rebuild corrupted fields index when necessary If the fields.idx was corrupted in someway, it would cause the shard to fail to load. Deleting the file will allow it to be rebuilt. This change handles this automatically so it's rebuilt if necessary without user intervention.	2018-01-16 11:31:07 -07:00
Edd Robinson	a2ece0a49a	Pass series id in via Index API	2018-01-15 12:00:31 +00:00
Ben Johnson	d295f30686	Remove series id check during deletion.	2018-01-15 12:00:31 +00:00
Edd Robinson	bb6bfad5ea	Ensure inmem index updated properly	2018-01-15 12:00:30 +00:00
Edd Robinson	b9d0a39131	Skip empty series keys	2018-01-15 12:00:30 +00:00
Edd Robinson	a4bef3a4bc	Refactoring delete tests	2018-01-15 12:00:30 +00:00
Jason Wilder	ba9a5af7eb	Mark series deleted in series file This commit adds the ability to correctly mark a series as deleted in the global series file. Whenever a shard engine determines that a series should be deleted, it checks with each shard's bitset for series that are to be deleted and are no longer contained in any shard-local bitsets. These series are then removed from the series file.	2018-01-15 12:00:30 +00:00
Edd Robinson	286c8f4c09	Return to original DELETE/DROP SERIES semantics This reverts commit `59afd8cc90`.	2018-01-15 12:00:30 +00:00
Jason Wilder	c2cbd14e09	Fix TestEngine_DisableEnableCompactions_Concurrent hang This test could hang due to an existing race that is still not fixed. The snapshot and level compaction goroutines would end up waiting on the wrong channel to be closed so they would never exit.	2018-01-11 11:58:20 -07:00
Edd Robinson	ed8b9925c8	Comment update	2018-01-11 01:01:54 +00:00
Edd Robinson	e610e7c21d	Track undeleted series IDs per-shard with inmem This commit adds a bitset into each shard's in-memory index, to be used to track undeleted series ids. Currently tsi1 support is not implemented. When new series are added to the shard, the series id is added to the bitset. When series are deleted from the shard, the series ids are removed from the bitset. Because each shard shares the same inmem index reference, the bitset is stored in the `ShardIndex`, which is local to each shard, and then different references are passed into the shared `Index` object, depending on which shard is writing the series.	2018-01-11 01:01:54 +00:00
Adam	938db68198	Update restore functionality to run in online mode, consume Enterprise backup files. (#9207) * Live Restore + Enterprise data format compatibility * Extended ImportData to import all DB's if no db name given * Added a new enterprise data test, and backup command now prints the backup file paths at conclusion * Added whole-system backup test * Update to use protobuf in all enterprise data cases * Update to test to do cross-testing with enterprise version * incremental enterprise backup format support	2018-01-10 13:59:18 -05:00
David Norton	1c452d83cb	fix #9286: return digest size	2018-01-08 13:15:14 -05:00
Stuart Carnie	ed207b54c3	updates after TSI / series file merge	2017-12-29 10:58:25 -07:00
Stuart Carnie	5dfe3b2645	inmem startup improvements * only call ParseTags when necessary * remove dependency on inmem.Series in tsdb test package * Measurement and Series are no longer exported. Their use is restricted to the inmem package * improve Measurement and Series types by exporting immutable fields and removing unnecessary APIs and locks Reduced startup time from 28s to 17s. Overall improvement including #9162 reduces startup from 46s to 17s for 1MM series across 14 shards.	2017-12-29 07:58:52 -07:00
Ben Johnson	d8b1d208c0	rebase	2017-12-20 15:13:34 -07:00
Edd Robinson	c476a0b4a1	Merge branch 'master' into er-tsi-index-part	2017-12-15 18:31:24 +00:00
Jason Wilder	2d85ff1d09	Adjust compaction planning Increase level 1 min criteria, fix only fast compactions getting run, and fix very large generations getting included in optimize plans.	2017-12-14 22:41:34 -07:00
Jason Wilder	749c9d2483	Rate limit disk IO when writing TSM files This limits the disk IO for writing TSM files during compactions and snapshots. This helps reduce the spiky IO patterns on SSDs and when compactions run very quickly.	2017-12-14 22:02:32 -07:00
Edd Robinson	59afd8cc90	Return to original DELETE/DROP SERIES semantics Since possibly v0.9 DELETE SERIES has had the unwanted side effect of removing series from the index when the last traces of series data are removed from TSM. This occurred because the inmem index was rebuilt on startup, and if there was no TSM data for a series then there could be no series to add to the index. This commit returns to the original (documented) DROP/DELETE SERIES behaviour. As such, when issuing DROP SERIES all instances of matching series will be removed from both the TSM engine and the index. When issuing DELETE SERIES only TSM data will be removed. It is up to the operator to remove series from the index. NB, this commit does not address how to remove series data from the series file when a shard rolls over.	2017-12-15 00:02:06 +00:00
Edd Robinson	9e3b17fd09	Ensure deleted series are not returned via iterators	2017-12-14 21:29:35 +00:00
Jason Wilder	7dc5327a0a	Adjust snapshot concurrency by latency This changes the approach to adjusting the amount of concurrency used for snapshotting to be based on the snapshot latency vs cardinality. The cardinality approach could use too much concurrency and increase the number of level 1 TSM files too quickly which incurs more disk IO. The latency model seems to adjust better to different workloads.	2017-12-13 13:17:56 -07:00
David Norton	253ea7cc5e	feat #9212: fix file in use bug on Windows	2017-12-13 09:29:07 -05:00
David Norton	4e13248d85	feat #9212: add ability to generate shard digests	2017-12-13 09:28:34 -05:00
Edd Robinson	f1bcc97e89	Fix auth tests	2017-12-12 21:25:35 +00:00
Edd Robinson	7d13bf3262	merge master	2017-12-08 17:21:58 +00:00
Edd Robinson	f6835632e7	Merge master into branch	2017-12-08 17:11:07 +00:00
Edd Robinson	3318c94a2f	Clean up 🛁:	2017-12-08 11:38:53 +00:00
Adam	a0b2195d6b	Pulled in backup-relevant code for review (#9193) for issue #8879	2017-12-07 11:35:20 -05:00
Jason Wilder	0a85ce2b73	Schedule compactions less aggressively This runs the scheduler every 5s instead of every 1s as well as reduces the scope of a level 1 plan.	2017-12-06 13:45:43 -07:00
Ben Johnson	493c1ed0d1	inmem tests passing.	2017-12-05 10:49:58 -07:00
Jason Wilder	909a2fb6cc	Fix deletes removing index for invalid time ranges If a delete for a time that does not exist was run, we would not remove the series key from the slice of series to remove from the index. This could be triggered by running something like "delete from cpu where time = 0" and if there was no data at time 0, the series would still be removed from the index.	2017-11-30 15:01:01 -07:00
Andrew Hare	761a8f8bec	Schedule a full compaction after a successful import	2017-11-29 13:50:38 -07:00
Ben Johnson	ca09f18e65	intermediate: tsdb compile	2017-11-29 11:20:18 -07:00
Jason Wilder	8633e38549	Fix removing series from index The loop to check if a series still exists in a TSM file was wrong in that it 1) exited early after one iteration and 2) had an off by one error that causes the wrong series to be marked as existing. This fixes both of these cases which can cause the index to become inconsistent with the data store on disk.	2017-11-29 10:45:04 -07:00
Edd Robinson	81976bca59	Refactor based on new design	2017-11-28 17:54:29 +00:00
Edd Robinson	b10249a9b3	Fix rebase	2017-11-28 15:58:35 +00:00
Edd Robinson	041a3837be	Ensure index can track fields	2017-11-28 15:57:03 +00:00
Edd Robinson	38e0dd695f	Allow concurrent access to Engine Index	2017-11-28 15:57:03 +00:00
Edd Robinson	12a2ff7fac	Add support for TSI shard streaming and shard size This commit firstly ensures that a shard's size on disk is accurately reported when using the tsi1 index, by including the on-disk size of the tsi1 index in the calculation. Secondly, this commit add support for shard streaming/copying when using the tsi1 index. Prior to this, a tsi1 index would not be correctly restored when streaming shards.	2017-11-28 15:57:02 +00:00
Jason Wilder	b59858e529	Ensure series keys are sorted before searching The Cache.ApplyEntryFn iterates keys according to the partitions and hashed values. This can cause the deleteKeys slice to contain unsorted keys when deleting series. The code uses a binary search on this slice later on and this can fail to detect that the series should still exist. The series is then removed from the index even though it has data still. Fixes #9116	2017-11-27 17:06:03 -07:00
Edd Robinson	e6b7140d65	Merge pull request #9143 from influxdata/er-show-tag-key-perf SHOW TAG KEYS with high cardinality and many shards	2017-11-27 15:04:15 +00:00
Stuart Carnie	7cdfd95966	initial opentrace implementation for ifql interface NOTE: does not include a default tracer until configuration across projects is standardized	2017-11-22 14:42:26 -07:00
Jason Wilder	cacb55fac4	Fix typos	2017-11-22 11:17:34 -07:00
Jason Wilder	dd1c030815	Remove limit count param on fields It's not used anymore.	2017-11-22 11:17:34 -07:00
Jason Wilder	c14b0e81b7	Save field types to speed up startup This persists the field types in a shard to avoid having to scan all the TSM files at startup.	2017-11-22 11:17:34 -07:00
Edd Robinson	68dd5e27c8	Improve performance of TagKeys	2017-11-21 17:16:47 +00:00
Jason Wilder	50b6ace75f	Fix wait reused while disabling compactions	2017-11-20 14:55:47 -07:00
Edd Robinson	6851db3fc9	Add FGA support to SHOW MEASUREMENTS	2017-11-17 11:06:43 +00:00
Ben Johnson	ede3fcf98e	intermediate	2017-11-15 16:09:25 -07:00
Jason Wilder	97e0d496a6	Add capability to force a full compaction This adds the capability to the engine to force a full compaction to be scheduled. When called, it snapshots any data in the cache, aborts running compactions and prevents level plans from returning level plans.	2017-11-15 07:14:27 -07:00

463 Commits (master)