influxdb

Commit Graph

Author	SHA1	Message	Date
Geoffrey Wossum	c7a591b57c	fix: prevent retention service from hanging (#25064 ) Fix issue that can cause the retention service to hang waiting on a `Shard.Close` call. When this occurs, no other shards will be deleted by the retention service. This is usually noticed as an increase in disk usage because old shards are not cleaned up. The fix adds to new methods to `Store`, `SetShardNewReadersBlocked` and `InUse`. `InUse` can be used to poll if a shard has active readers, which the retention service uses to skip over in-use shards to prevent the service from hanging. `SetShardNewReadersBlocked` determines if new read access may be granted to a shard. This is required to prevent race conditions around the use of `InUse` and the deletion of shards. If the retention service skips over a shard because it is in-use, the shard will be checked again the next time the retention service is run. It can be deleted on subsequent checks if it is no longer in-use. If the shards is stuck in-use, the retention service will not be able to delete the shards, which can be observed in the logs for manual intervention. Other shards can still be deleted by the retention service even if a shard is stuck with readers. closes: #25063 (cherry picked from commit `b4bd607eef`)	2024-06-13 16:08:02 -05:00
davidby-influx	8caeff7d0c	fix: ensure TSMBatchKeyIterator and FileStore close all TSMReaders (#24957 ) (#24963 ) Do not let errors on closing a TSMReader prevent other closes. (cherry picked from commit `82cbdb5478`) closes https://github.com/influxdata/influxdb/issues/24960	2024-05-06 10:45:00 -07:00
Sam Arnold	9e9f1be574	fix: remove dead iterator (#23888 )	2022-11-09 16:24:01 -05:00
davidby-influx	e53f75e06d	fix: discard excessive errors (#22379 ) The tsmBatchKeyIterator discards excessive errors to avoid out-of-memory crashes when compacting very corrupt files. Any error beyond DefaultMaxSavedErrors (100) will be discarded instead of appended to the error slice. closes https://github.com/influxdata/influxdb/issues/22328	2021-09-03 09:11:05 -07:00
davidby-influx	7d3efe1e9e	fix: avoid compaction queue stats flutter. (#22195 ) When the compaction planner runs, if it cannot acquire a lock on the files it plans to compact, it returns a nil list of compaction groups. This, in turn, sets the engine statistics for compactions queues to zero, which is incorrect. Instead, use the length of pending files which would have been returned. closes https://github.com/influxdata/influxdb/issues/22138	2021-08-16 09:21:07 -07:00
davidby-influx	73bdb2860e	chore: add logging to compaction (#21707 ) Compaction logging will generate intermediate information on volume of data written and output files created, as well as improve some of the anti-entropy messages related to compaction. This will also apply to `influx_tools compact` Closes https://github.com/influxdata/influxdb/issues/21704	2021-06-16 15:28:44 -07:00
Ben Johnson	3f59d4be8e	fix(tsdb): Fix error lists	2020-04-01 14:54:39 -06:00
elbehery	042128b948	fix(tsdb): Replace panic with error while de/encoding corrupt data fixes #17440 While encoding or decoding corrupt data, the current behaviour is to `panic`. This commit replaces the `panic` with `error` to be propagated up to the calling `iterator`. To avoid overwriting other `error`, iterators now wraps a `TSMErrors` which contains ALL the encountered errors. TSMErrors itself implements `Error()`, the returned string contains all the error msgs, separated by "," delimiter.	2020-04-01 20:51:11 +02:00
tmgordeeva	f1d26652e9	fix(storage): skip TSM files with block read errors (#15885 ) * fix(storage): skip TSM files with block read errors When we find a bad TSM file during compaction, propagate the error up and move the bad file aside. The engine will disregard the file so the next compaction will not hit the same error.	2019-12-13 15:05:39 -08:00
Stuart Carnie	86734e7fcd	fix(storage): Don't panic when length of source slice is too large StringArrayEncodeAll will panic if the total length of strings contained in the src slice is > 0xffffffff. This change adds a unit test to replicate the issue and an associated fix to return an error. This also raises an issue that compactions will be unable to make progress under the following condition: * multiple string blocks are to be merged to a single block and * the total length of all strings exceeds the maximum block size that snappy will encode (0xffffffff) The observable effect of this is errors in the logs indicating a compaction failure. Fixes #13687	2019-06-04 17:05:01 -07:00
Ben Johnson	2dd913d71b	Revert "Limit force-full and cold compaction size." This reverts commit `40db64d0b9`.	2019-02-11 11:07:44 -07:00
Ben Wells	e9bada090f	Fix misspelling identified by misspell	2019-02-03 20:27:43 +00:00
Ben Johnson	40db64d0b9	Limit force-full and cold compaction size. This commit limits the number of files that can be compacted in a single group when forcing a full compaction or when a shard becomes cold. This is to prevent too many files being compacted at the same time.	2018-12-05 10:18:56 -07:00
Edd Robinson	be662a5853	Fix TSM index maxtime modification	2018-10-29 15:44:31 +00:00
Edd Robinson	5054d6fae4	Address PR feedback	2018-10-16 13:37:49 +01:00
Edd Robinson	09da18c08e	Add TSM batch key iterator The batch focussed TSM key iterator iterates TSM blocks, decoding and merging blocks where appropriate using the the batch focussed approaches.	2018-10-16 12:08:12 +01:00
Jeff Wendling	1c0e49e002	tsm1: ensure all written tsm files are fsynced we were asserting to an *os.File in order to call Sync, but in some cases the file handle has been wrapped, for example with limiting. instead, assert to minimal interfaces for the functionality we need and attempt to add some robustness in the code that creates the writers by using a stronger interface with a Sync method. fixes #9991	2018-06-25 11:36:22 -06:00
Jacob Marble	bb313765e4	tsdb/tsm1: Clean up TSM filename format/parse	2018-05-29 09:57:48 -07:00
Jeff Wendling	1a8931af42	Merge pull request #9841 from influxdata/jmw-ensure-no-race-conditions tsm1: ensure some race conditions are impossible	2018-05-16 11:56:10 -06:00
Jeff Wendling	7d2bb19b74	tsm1: ensure some race conditions are impossible The InUse call on TSMFiles is inherently racy in the presence of Ref calls outside of the file store mutex. In addition, we return some TSMFiles to callers without them being Ref'd which might allow them to be closed from underneath. While I believe it is the case that it would be impossible, as the only thing that gets a handle externally is compaction, and compaction enforces that only one handle exists at a time, and thus is only deleted once after the compaction is done with it, it's not very obvious or enforced. Instead, always return a TSMFile with a Ref call under the read lock, and require that no one else calls Ref. That way, it cannot transition to referenced if the InUse call returns false under the write lock. The CreateSnapshot method was racy in a number of ways in the presence of multiple calls or compactions: it did not take references to the TSMFiles, and the temporary directory it creates could have been shared with concurrent CreateSnapshot calls. In addition, the files slice could have been concurrently mutated during a compaction as well. Instead, under the write lock, make a local copy of the state for the compaction, including Ref calls (write locks are implicitly read locks). Then, there is no need for a lock at all afterward. Add some comments to explain these issues at the call sites of InUse, and document that the Files method that returns the slice unprotected is only for tests.	2018-05-14 19:45:42 -06:00
Ben Johnson	35a64dee99	Inject tsm file naming.	2018-05-14 10:46:38 -06:00
Jason Wilder	0eb6564e79	Add extension point to swap out the compaction planner	2018-03-21 15:51:00 -06:00
hpbieker	c892bf15a1	Fix missing sorting of blocks when compacting.	2018-01-03 10:21:11 +01:00
Jason Wilder	2d85ff1d09	Adjust compaction planning Increase level 1 min criteria, fix only fast compactions getting run, and fix very large generations getting included in optimize plans.	2017-12-14 22:41:34 -07:00
Jason Wilder	749c9d2483	Rate limit disk IO when writing TSM files This limits the disk IO for writing TSM files during compactions and snapshots. This helps reduce the spiky IO patterns on SSDs and when compactions run very quickly.	2017-12-14 22:02:32 -07:00
Jason Wilder	7dc5327a0a	Adjust snapshot concurrency by latency This changes the approach to adjusting the amount of concurrency used for snapshotting to be based on the snapshot latency vs cardinality. The cardinality approach could use too much concurrency and increase the number of level 1 TSM files too quickly which incurs more disk IO. The latency model seems to adjust better to different workloads.	2017-12-13 13:17:56 -07:00
Jason Wilder	9f2a422039	Use disk based TSM index more selectively The disk based temp index for writing a TSM file was used for compactions other than snapshot compactions. That meant it was used even for smaller compactiont that would not use much memory. An unintended side-effect of this is higher disk IO when copying the index to the final file. This switches when to use the index based on the estimated size of the new index that will be written. This isn't exact, but seems to work kick in at higher cardinality and larger compactions when it is necessary to avoid OOMs.	2017-12-06 13:45:43 -07:00
Jason Wilder	0a85ce2b73	Schedule compactions less aggressively This runs the scheduler every 5s instead of every 1s as well as reduces the scope of a level 1 plan.	2017-12-06 13:45:43 -07:00
Jason Wilder	9c1d7d00a9	Switch O_SYNC to periodic fsync O_SYNC was added with writing TSM files to fix an issue where the final fsync at the end cause the process to stall. This ends up increase disk util to much so this change switches to use multiple fsyncs while writing the TSM file instead of O_SYNC or one large one at the end.	2017-12-06 09:35:24 -07:00
Jason Wilder	b6096414c2	Fix compactions aborting early If there were many individual deletes to a series that ended up deleting every value in the block and the tombstone timestamps were not contigous, it was possible for the TSMKeyIterator to return false for Next incorrectly. This causes the compaction to drop any remaining data in the file. Normally, if all the data is deleted via tombstones, we remove the whole key from the TSM index. In this case, we're not able to determine that the key is fully deleted until the block is decode and tombstones are applied. This changes the TSMKeyIterator to detect this condition and continue to the next key instead of aborting.	2017-11-30 14:38:09 -07:00
Edd Robinson	12a2ff7fac	Add support for TSI shard streaming and shard size This commit firstly ensures that a shard's size on disk is accurately reported when using the tsi1 index, by including the on-disk size of the tsi1 index in the calculation. Secondly, this commit add support for shard streaming/copying when using the tsi1 index. Prior to this, a tsi1 index would not be correctly restored when streaming shards.	2017-11-28 15:57:02 +00:00
Jason Wilder	97e0d496a6	Add capability to force a full compaction This adds the capability to the engine to force a full compaction to be scheduled. When called, it snapshots any data in the cache, aborts running compactions and prevents level plans from returning level plans.	2017-11-15 07:14:27 -07:00
Jason Wilder	9f102adabe	Abort BlockIterator iteration if deletes detected This fixes a potential bug where the BlockIterator would skip blocks if the underlying TSMReader had deletes on it concurrently. This could possibly occur due to changes in `91eb9de3` that now use the existing TSMReaders from the FileStore instead of creating new ones during compaction.	2017-10-18 18:16:37 -06:00
lrita	2f0aa4a420	remove duplicated code in cacheKeyIterator.encode()	2017-10-13 20:39:15 +08:00
Jason Wilder	00a403f60e	Reduce allocation in tsmKeyIterator.Next This reuses some intermediate buffers and structs while compacting files.	2017-10-04 17:35:56 -06:00
Jason Wilder	06226d6fd3	Handle orphan lower level TSM files during full planning Some files seem to get orphan behind higher levels. This causes the compactions to get blocked as the lowere level files will not get picked up by their lower level planners. This allows the full plan to identify them and pull them into their plans.	2017-10-04 08:13:14 -06:00
Jason Wilder	0c0505881f	Remove multiple file skipping for full compaction planning This check doesn't make sense for high cardinality data as the files typically get big and sparse very quickly. This causes a lot of extra disk space to be used which is taken up by large indexes and sparse data.	2017-10-03 10:48:14 -06:00
Jason Wilder	4ff4ba0841	Use first file in generation for level With higher cardinality or larger series keys, the files can roll over early which causes them to take longer to be compacted by higher levels. This causes larger disk usage and higher numbers of tsm files at times.	2017-10-03 10:48:14 -06:00
Jason Wilder	2c5006fccc	Rework snapshotting concurrency This switches the thresholds that are used for writing snapshots concurrently. This scales better than the prior model.	2017-10-03 10:48:14 -06:00
Jason Wilder	70817350b7	Ensure temp index files are cleaned up on error	2017-10-03 10:48:14 -06:00
Jason Wilder	ae821f4e2d	Rework compaction scheduling This changes the compaction scheduling to better utilize the available cores that are free. Previously, a level was planned in its own goroutine and would kick off a number of compactions groups. The problem with this model was that if there were 4 groups, and 3 completed quickly, the planning would be blocked for that level until the last group finished. If the compactions at the prior level are running more quickly, a large backlog could accumlate. This now moves the planning to a single goroutine that plans each level in succession and starts as many groups as it can. When one group finishes, the planning will start the next group for the level.	2017-10-03 10:48:13 -06:00
Jason Wilder	1610ae5727	Don't return tsm files part of a compaction plan	2017-10-03 10:48:13 -06:00
Jason Wilder	122a74c692	Use synchronous IO for wal and tsm writing The fysncs due to large writes when writing to TSM files and the WAL can eventually cause large pauses. Since we already buffer writes, using synchronous IO reduces fsync latency by ensuring the individiual writes hit disk. This spreads out the latecncy across multiple writes better.	2017-09-25 12:44:57 -06:00
Jason Wilder	db204f3eb7	Default concurrent compactions to 50% of available cores	2017-09-21 12:48:11 -06:00
Jason Wilder	796de3dcea	Reduce encoder pool checkout contention With higher cardinalities, the encoder pools where become a bottleneck. This changes the snapshot compactions ot checkout one encoder of each type and re-use it while writing the snapshots as opposed to repeatedly checking it out and in.	2017-09-19 15:27:26 -06:00
Jason Wilder	391a6288c6	Write parallel snapshot for higher cardinalities	2017-09-19 15:27:26 -06:00
Jason Wilder	ddeba2c86b	Split large snapshots and write concurrently	2017-09-19 15:27:25 -06:00
Jason Wilder	91eb9de341	Use existing TSMReader from file store during compactions Compactions would create their own TSMReaders for simplicity. With very high cardinality compactions, creating the reader and indirectIndex can start to use a significant amount of memory. This changes the compactions to use a reader that is already allocated and managed by the FileStore.	2017-09-11 15:29:25 -06:00
Jason Wilder	a8d9eeef36	Reduce lock contention when deleting high cardinality series Deleting high cardinality series could take a very long time, cause write timeouts as well as dead lock the process. This fixes these issue to by changing the approach for cleaning up the indexes and reducing lock contention. The prior approach delete each series and updated every index (inmem) during the delete. This was very slow and cause the index to be locked while it items in a slice were removed one by one. This has been changed to mark series as deleted and then rebuild the index asynchronously which speeds up the process. There was also a dead lock that could occur when deleing the field set. Deleting the field set held a write lock and the function it invoked under the lock could try to take a read lock on the field set. This would then deadlock. This approach was also very slow and caused time out for writes. It now uses faster approach that checks for the existing of the measurment in the cache and filestore which does not take write locks.	2017-09-07 11:36:02 -06:00
Jason Wilder	e265d150be	Fix leaking tmp file when large compaction aborted If a large compaction was running and was aborted. It could would leave some tmp files around for files that it had fully written. The current active file was cleaned up, but already completed ones would not. This would occur when a TSM file needed to rollover due to size.	2017-08-21 17:04:57 -06:00

1 2 3 4

158 Commits (1.11)