Commit Graph

103 Commits (db/shards_persisting_chaos_testing)

Author SHA1 Message Date
Geoffrey Wossum 1fbe319080
fix: reduce excessive CPU usage during compaction planning (#26432)
Co-authored-by: devanbenz <devandbenz@gmail.com>
2025-05-27 16:55:20 -05:00
Geoffrey Wossum 66f4dbeaad
fix: limit number of concurrent optimized compactions (#26319)
Limit number of concurrent optimized compactions so that level compactions do not get starved. Starved level compactions result in a sudden increase in disk usage.

Add [data] max-concurrent-optimized-compactions for configuring the maximum number of concurrent optimized compactions. The default value is 1 (see the configuration sketch below).

Co-authored-by: davidby-influx <dbyrne@influxdata.com>
Co-authored-by: devanbenz <devandbenz@gmail.com>
Closes: #26315
2025-05-06 15:42:39 -05:00
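
A minimal configuration sketch for the option the commit above adds; the key name is taken from the commit message, and its placement under the [data] section follows the usual influxdb.conf layout.

```toml
[data]
  # Cap on optimized compactions running at once so that level
  # compactions are not starved; the default is 1.
  max-concurrent-optimized-compactions = 1
```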
WeblWabl 96e44cac73
fix: PlanOptimize is running too frequently (#26211)
PlanOptimize is being checked far too frequently. This PR is the simplest change that can be made to ensure that PlanOptimize is not being run too often. To reduce the frequency, I've added a lastWrite parameter to PlanOptimize (sketched below) and added a test that mocks the edge case, observed in the wild, that led to this PR.

Previously, the test cases for PlanOptimize did not check whether certain cases would instead be picked up by Plan. I've adjusted a few of the existing test cases after modifying Plan and PlanOptimize to use the same lastWrite time.
2025-04-08 12:22:29 -05:00
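
A rough Go sketch of the idea in the commit above, using stand-in types; the real method lives on the tsm1 planner, and its exact signature and fields are not reproduced here.

```go
package sketch

import "time"

// compactionGroup stands in for tsm1's CompactionGroup (a set of file names);
// planner is a stripped-down stand-in for the real planner type.
type compactionGroup []string

type planner struct {
	compactFullWriteColdDuration time.Duration
}

// planOptimize mirrors the idea in the commit above: bail out early when the
// shard has been written to recently, so optimized plans are not produced
// (and re-checked) far more often than necessary.
func (p *planner) planOptimize(lastWrite time.Time, candidates []compactionGroup) []compactionGroup {
	if time.Since(lastWrite) < p.compactFullWriteColdDuration {
		return nil // shard is still hot; let the normal Plan path handle it
	}
	return candidates
}
```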
WeblWabl d8bcbd894c
feat: Add CompactPointsPerBlock config opt (#26100)
* feat: Add CompactPointsPerBlock config opt
This PR adds an additional parameter for influxd,
CompactPointsPerBlock, and adjusts DefaultAggressiveMaxPointsPerBlock
to 10,000. We had discovered that with the points per block set to
100,000, compacted TSM file sizes were increasing. After lowering the
points per block to 10,000, the file sizes decreased.
The value is exposed as a parameter that administrators can adjust,
which allows some tuning if compression problems are encountered.
2025-03-05 14:59:06 -06:00
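
A hypothetical sketch of how such a tunable might be wired up. Apart from CompactPointsPerBlock and DefaultAggressiveMaxPointsPerBlock, which appear in the commit message, the field layout and TOML key below are assumptions, not the actual influxdb code.

```go
package sketch

// Lowered from 100,000; smaller blocks compressed noticeably better in the
// cases described above.
const DefaultAggressiveMaxPointsPerBlock = 10000

// Config is a stripped-down stand-in for the engine configuration; the TOML
// key is a guess at the usual kebab-case convention.
type Config struct {
	// CompactPointsPerBlock is the points-per-block target used by
	// aggressive (optimized) compactions; administrators can tune it if
	// compression problems are encountered.
	CompactPointsPerBlock int `toml:"compact-points-per-block"`
}

// NewConfig returns a Config with the default points-per-block value.
func NewConfig() Config {
	return Config{CompactPointsPerBlock: DefaultAggressiveMaxPointsPerBlock}
}
```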
davidby-influx 2ab5aad52e
chore: add logging to Filestore.purger (#26089)
Also fixes error type checks in
TestCompactor_CompactFull_InProgress
2025-03-05 11:46:07 -08:00
davidby-influx 800970490a
fix: move aside TSM file on errBlockRead (#25839)
The error type check for errBlockRead was incorrect,
and bad TSM files were not being moved aside when
that error was encountered. Use errors.Join,
errors.Is, and errors.As to correctly unwrap multiple
errors.

Closes https://github.com/influxdata/influxdb/issues/25838
2025-01-22 10:46:31 -08:00
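
The standard-library pattern the fix relies on looks roughly like this; errBlockRead here is a stand-in sentinel, not the engine's actual error value.

```go
package main

import (
	"errors"
	"fmt"
)

var errBlockRead = errors.New("block read error") // stand-in sentinel

func compactOne() error {
	// A compaction can fail for several reasons at once; errors.Join keeps
	// them all instead of discarding the wrapped sentinel.
	return errors.Join(fmt.Errorf("compacting shard 42"), errBlockRead)
}

func main() {
	if err := compactOne(); errors.Is(err, errBlockRead) {
		// errors.Is unwraps joined/wrapped errors correctly, so the bad TSM
		// file can be moved aside rather than retried forever.
		fmt.Println("moving bad TSM file aside:", err)
	}
}
```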
WeblWabl f04105bede
feat: Modify optimized compaction to cover edge cases (#25594)
* feat: Modify optimized compaction to cover edge cases
This PR changes the compaction algorithm to account for the following
cases that were not previously handled:

- Many generations with a group size over 2 GB
- A single generation with many files and a group size under 2 GB
  (group size being the total size of the TSM files in that shard directory)
- Shards that may have a group size over 2 GB but many fragmented files
  (under 2 GB and under the aggressive points-per-block count)

closes https://github.com/influxdata/influxdb/issues/25666
2025-01-14 14:51:09 -06:00
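
A purely illustrative sketch of the kind of decision logic those cases imply; the names and thresholds below are stand-ins, and the real planner is considerably more involved.

```go
package sketch

const (
	maxGroupSize        = 2 * 1024 * 1024 * 1024 // 2 GB
	aggressiveMaxPoints = 10000
)

type tsmFile struct {
	size           int64
	pointsPerBlock int
}

// needsOptimize reports whether a generation's files look like one of the
// cases listed above: a group over 2 GB, or many fragmented files that never
// reached the aggressive points-per-block target.
func needsOptimize(files []tsmFile) bool {
	var groupSize int64
	fragmented := 0
	for _, f := range files {
		groupSize += f.size
		if f.size < maxGroupSize && f.pointsPerBlock < aggressiveMaxPoints {
			fragmented++
		}
	}
	return groupSize > maxGroupSize || (len(files) > 1 && fragmented == len(files))
}
```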
davidby-influx e974165d25
fix: do not leak file handles from Compactor.write (#25725)
There are a number of code paths in Compactor.write which
on error can lead to leaked file handles to temporary files.
This, in turn, prevents the removal of the temporary files until
InfluxDB is rebooted, releasing the file handles.

closes https://github.com/influxdata/influxdb/issues/25724
2025-01-03 14:43:41 -08:00
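
The general shape of the fix is the usual defer-based cleanup pattern; this is a sketch with stand-in names, not the actual Compactor.write code.

```go
package sketch

import "os"

// writeTemp shows the general shape of the fix: every error path closes the
// temporary file and removes it, instead of returning with the handle still
// open until the process exits.
func writeTemp(encode func(*os.File) error) (err error) {
	f, err := os.CreateTemp("", "compaction-*.tsm.tmp")
	if err != nil {
		return err
	}
	defer func() {
		// Close exactly once on every path.
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
		if err != nil {
			os.Remove(f.Name()) // don't leave the temp file behind on failure
		}
	}()
	return encode(f)
}
```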
Geoffrey Wossum b4bd607eef
fix: prevent retention service from hanging (#25055)
* fix: prevent retention service from hanging

Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.

The fix adds two new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.

If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
a shard is stuck in-use, the retention service will not be able to
delete it; this can be observed in the logs so that an operator can
intervene manually. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.

closes: #25054
2024-06-13 11:07:17 -05:00
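
A sketch of how the retention service might use the two new methods. The method names come from the commit message, but the signatures and surrounding types below are assumptions.

```go
package sketch

// store is a stand-in for the relevant part of tsdb.Store.
type store interface {
	SetShardNewReadersBlocked(shardID uint64, blocked bool) error
	InUse(shardID uint64) (bool, error)
	DeleteShard(shardID uint64) error
}

// deleteIfIdle sketches the retention-service logic: block new readers, skip
// the shard if existing readers are still active, otherwise delete it.
// Skipped shards are simply retried on the next retention check.
func deleteIfIdle(s store, shardID uint64) (deleted bool, err error) {
	if err := s.SetShardNewReadersBlocked(shardID, true); err != nil {
		return false, err
	}
	inUse, err := s.InUse(shardID)
	if err != nil || inUse {
		// Unblock readers and try again on the next retention pass.
		unblockErr := s.SetShardNewReadersBlocked(shardID, false)
		if err == nil {
			err = unblockErr
		}
		return false, err
	}
	return true, s.DeleteShard(shardID)
}
```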
Sam Arnold 9e9f1be574
fix: remove dead iterator (#23888) 2022-11-09 16:24:01 -05:00
davidby-influx 7d3efe1e9e
fix: avoid compaction queue stats flutter. (#22195)
When the compaction planner runs, if it cannot acquire
a lock on the files it plans to compact, it returns a
nil list of compaction groups. This, in turn, sets the
engine statistics for compactions queues to zero,
which is incorrect. Instead, use the length of pending
files which would have been returned.

closes https://github.com/influxdata/influxdb/issues/22138
2021-08-16 09:21:07 -07:00
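
Illustratively, the change amounts to reporting the pending-file count when the lock cannot be acquired rather than zero; the function below is a stand-in, not the planner's actual code.

```go
package sketch

// queueDepth returns the compaction-queue statistic: when the planner could
// not acquire the files it wanted (acquired == false), report the number of
// files still pending instead of letting the stat flap to zero.
func queueDepth(groups [][]string, pendingFiles int, acquired bool) int {
	if !acquired {
		return pendingFiles
	}
	n := 0
	for _, g := range groups {
		n += len(g)
	}
	return n
}
```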
davidby-influx 73bdb2860e
chore: add logging to compaction (#21707)
Compaction logging will generate intermediate information on 
volume of data written and output files created, as well as 
improve some of the anti-entropy messages related to compaction.

This will also apply to `influx_tools compact`

Closes https://github.com/influxdata/influxdb/issues/21704
2021-06-16 15:28:44 -07:00
Ayan George c0e391935d
fix(tsdb): address staticcheck warning SA4006 (#18127)
This commit addresses staticcheck warning SA4006 "this value of VAR is
never used" by removing unused variables
2020-05-18 13:11:44 -04:00
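
For context, SA4006 flags assignments whose value is never read before being overwritten, as in this small stand-alone example (unrelated to the actual variables the commit removes).

```go
package sketch

import "strconv"

// parsePort demonstrates the SA4006 pattern: the first value assigned to
// port is never used because it is immediately overwritten.
func parsePort(s string) int {
	port := 8086                 // SA4006: this value of port is never used
	port, err := strconv.Atoi(s) // overwritten here
	if err != nil {
		return 8086
	}
	return port
}
```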
Ben Johnson 3f59d4be8e fix(tsdb): Fix error lists 2020-04-01 14:54:39 -06:00
tmgordeeva f1d26652e9
fix(storage): skip TSM files with block read errors (#15885)
* fix(storage): skip TSM files with block read errors

When we find a bad TSM file during compaction, propagate the error up and move
the bad file aside. The engine will disregard the file so the next compaction
will not hit the same error.
2019-12-13 15:05:39 -08:00
Stuart Carnie 86734e7fcd
fix(storage): Don't panic when length of source slice is too large
StringArrayEncodeAll will panic if the total length of strings
contained in the src slice is > 0xffffffff. This change adds a unit
test to replicate the issue and an associated fix to return an error.

This also raises an issue that compactions will be unable to make
progress under the following condition:

* multiple string blocks are to be merged to a single block and
* the total length of all strings exceeds the maximum block size that
  snappy will encode (0xffffffff)

The observable effect of this is errors in the logs indicating a
compaction failure.

Fixes #13687
2019-06-04 17:05:01 -07:00
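
A sketch of the guard the fix adds, using stand-in names; only StringArrayEncodeAll and the 0xffffffff limit come from the commit message.

```go
package sketch

import (
	"errors"
	"math"
)

// errTooLarge stands in for the error returned instead of panicking; the
// real check lives in StringArrayEncodeAll.
var errTooLarge = errors.New("block too large for snappy encoding")

// checkStringBlockSize refuses, with an error, any source whose total string
// length exceeds the maximum size snappy can encode (0xffffffff bytes).
func checkStringBlockSize(src []string) error {
	var total uint64
	for _, s := range src {
		total += uint64(len(s))
	}
	if total > math.MaxUint32 {
		return errTooLarge
	}
	return nil
}
```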
Ben Johnson 2dd913d71b
Revert "Limit force-full and cold compaction size."
This reverts commit 40db64d0b9.
2019-02-11 11:07:44 -07:00
Ben Johnson 40db64d0b9
Limit force-full and cold compaction size.
This commit limits the number of files that can be compacted in
a single group when forcing a full compaction or when a shard
becomes cold. This is to prevent too many files being compacted
at the same time.
2018-12-05 10:18:56 -07:00
Stuart Carnie 5d083887a5 chore: Compactor test which replicates issue #10465
Due to an encoding bug with simple8b, it is possible that the
MaxTime for a TSM index entry does not match the last encoded timestamp.
2018-11-14 09:13:13 -07:00
Jacob Marble 0dc5393441 tsm/cache: Remove unused function parameter 2018-06-13 15:22:37 -07:00
Jacob Marble bb313765e4 tsdb/tsm1: Clean up TSM filename format/parse 2018-05-29 09:57:48 -07:00
Jeff Wendling ce565965a4 tsdb: avoid nil checks on the observer
This avoids nil panics in case someone eventually forgets to set the observer.
2018-05-23 13:15:41 -06:00
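
The usual way to avoid such nil checks is a no-op implementation; this is a generic sketch, with the interface and method names below chosen for illustration only.

```go
package sketch

// observer is a stand-in for the file-store observer interface the commit
// refers to; the method name here is illustrative.
type observer interface {
	FileFinishing(path string) error
}

type noopObserver struct{}

func (noopObserver) FileFinishing(string) error { return nil }

// withObserver normalizes a possibly-nil observer into one that is always
// safe to call, so callers never need `if obs != nil` checks.
func withObserver(obs observer) observer {
	if obs == nil {
		return noopObserver{}
	}
	return obs
}
```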
Jeff Wendling 1a8931af42
Merge pull request #9841 from influxdata/jmw-ensure-no-race-conditions
tsm1: ensure some race conditions are impossible
2018-05-16 11:56:10 -06:00
Jeff Wendling 7d2bb19b74 tsm1: ensure some race conditions are impossible
The InUse call on TSMFiles is inherently racy in the presence of
Ref calls outside of the file store mutex. In addition, we return
some TSMFiles to callers without them being Ref'd which might allow
them to be closed from underneath. While I believe that is currently
impossible, since the only thing that acquires a handle externally is
compaction, and compaction enforces that only one handle exists at a
time and deletes the file only once the compaction is done with it,
this is neither obvious nor enforced.

Instead, always return a TSMFile with a Ref call under the read
lock, and require that no one else calls Ref. That way, it cannot
transition to referenced if the InUse call returns false under the
write lock.

The CreateSnapshot method was racy in a number of ways in the presence
of multiple calls or compactions: it did not take references to the
TSMFiles, and the temporary directory it creates could have been
shared with concurrent CreateSnapshot calls. In addition, the
files slice could have been concurrently mutated during a compaction
as well.

Instead, under the write lock, make a local copy of the state for
the compaction, including Ref calls (write locks are implicitly
read locks). Then, there is no need for a lock at all afterward.

Add some comments to explain these issues at the call sites of InUse,
and document that the Files method that returns the slice unprotected
is only for tests.
2018-05-14 19:45:42 -06:00
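
A minimal sketch of the "Ref under the store lock" rule described above, using stand-in types rather than the real TSMFile and FileStore.

```go
package sketch

import "sync"

type tsmFile struct {
	mu   sync.Mutex
	refs int
}

func (f *tsmFile) Ref()   { f.mu.Lock(); f.refs++; f.mu.Unlock() }
func (f *tsmFile) Unref() { f.mu.Lock(); f.refs--; f.mu.Unlock() }

type fileStore struct {
	mu    sync.RWMutex
	files []*tsmFile
}

// CopyFiles hands out referenced copies of the current file set; callers must
// Unref each file when done and must not call Ref themselves. Because Ref is
// taken while the store lock is held, an InUse check under the write lock
// cannot race with a caller acquiring a new reference.
func (s *fileStore) CopyFiles() []*tsmFile {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]*tsmFile, len(s.files))
	for i, f := range s.files {
		f.Ref()
		out[i] = f
	}
	return out
}
```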
Ben Johnson 35a64dee99
Inject tsm file naming. 2018-05-14 10:46:38 -06:00
Stuart Carnie 0f6e6fb9ef
Merge pull request #9192 from influxdata/sgc-writer
ensure tsmWriter#Write returns ErrMaxBlocksExceeded
2018-02-01 15:39:01 -07:00
Hans Petter Bieker 7a273ccdb5 Fixed issue where compaction did not sort when blocks are unsorted and overlapping. 2018-01-04 15:25:26 +01:00
hpbieker ee185e18b7 Added unit test TestCompactor_Compact_UnsortedBlocks. 2018-01-03 09:42:36 +01:00
Jason Wilder 2d85ff1d09 Adjust compaction planning
Increase level 1 min criteria, fix only fast compactions getting run,
and fix very large generations getting included in optimize plans.
2017-12-14 22:41:34 -07:00
Jason Wilder 7dc5327a0a Adjust snapshot concurrency by latency
This changes the approach to adjusting the amount of concurrency
used for snapshotting to be based on the snapshot latency vs
cardinality.  The cardinality approach could use too much concurrency
and increase the number of level 1 TSM files too quickly which incurs
more disk IO.

The latency model seems to adjust better to different workloads.
2017-12-13 13:17:56 -07:00
Jason Wilder 0a85ce2b73 Schedule compactions less aggressively
This runs the scheduler every 5s instead of every 1s as well as reduces
the scope of a level 1 plan.
2017-12-06 13:45:43 -07:00
Stuart Carnie fffd123646 update unit test 2017-12-01 12:19:28 -07:00
Jason Wilder b6096414c2 Fix compactions aborting early
If there were many individual deletes to a series that ended up
deleting every value in the block and the tombstone timestamps
were not contiguous, it was possible for the TSMKeyIterator to
return false for Next incorrectly.  This causes the compaction to
drop any remaining data in the file.

Normally, if all the data is deleted via tombstones, we remove the
whole key from the TSM index.  In this case, we're not able to determine
that the key is fully deleted until the block is decoded and tombstones
are applied.

This changes the TSMKeyIterator to detect this condition and continue
to the next key instead of aborting.
2017-11-30 14:38:09 -07:00
Jason Wilder 97e0d496a6 Add capability to force a full compaction
This adds the capability to the engine to force a full compaction
to be scheduled.  When called, it snapshots any data in the cache,
aborts running compactions, and prevents the level planners from
returning level plans.
2017-11-15 07:14:27 -07:00
Jason Wilder 17bae05370 Allow buffering tombstones before writing to disk 2017-11-13 08:48:03 -07:00
Jason Wilder 06226d6fd3 Handle orphan lower level TSM files during full planning
Some files seem to get orphaned behind higher levels.  This causes
compactions to get blocked, as the lower level files will not
get picked up by their lower level planners.  This change allows the full
plan to identify them and pull them into its plans.
2017-10-04 08:13:14 -06:00
Jason Wilder 0c0505881f Remove multiple file skipping for full compaction planning
This check doesn't make sense for high cardinality data as the files
typically get big and sparse very quickly.  This causes a lot of extra
disk space to be used which is taken up by large indexes and sparse
data.
2017-10-03 10:48:14 -06:00
Jason Wilder 4ff4ba0841 Use first file in generation for level
With higher cardinality or larger series keys, the files can roll
over early which causes them to take longer to be compacted by higher
levels.  This causes larger disk usage and higher numbers of tsm files
at times.
2017-10-03 10:48:14 -06:00
Jason Wilder 1610ae5727 Don't return tsm files part of a compaction plan 2017-10-03 10:48:13 -06:00
Jason Wilder ddeba2c86b Split large snapshots and write concurrently 2017-09-19 15:27:25 -06:00
Jason Wilder 7388eb9499 Use disk when writing TSM index 2017-09-11 15:29:25 -06:00
Jason Wilder d3e832b462 Use offheap memory for indirect index offsets slice 2017-09-11 15:29:25 -06:00
Jason Wilder 91eb9de341 Use existing TSMReader from file store during compactions
Compactions would create their own TSMReaders for simplicity. With
very high cardinality compactions, creating the reader and indirectIndex
can start to use a significant amount of memory.

This changes the compactions to use a reader that is already allocated
and managed by the FileStore.
2017-09-11 15:29:25 -06:00
Jason Wilder 778000435a Convert all keys from string to []byte in TSM engine
This switches all the interfaces that take string series key to
take a []byte.  This eliminates many small allocations where we
convert between the two repeatedly.  Eventually, this change should
propagate further up the stack.
2017-07-28 11:00:50 -06:00
Jason Wilder 18a02d50d7 Interrupt in progress TSM compactions
When snapshots and compactions are disabled, the check to see if
the compaction should be aborted occurs in between writing to the
next TSM file.  If a large compaction is running, it might take
a while for the file to be finished writing causing long delays.

This now interrupts compactions while iterating over the blocks to
write which allows them to abort immediately.
2017-07-27 15:58:56 -06:00
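
The interruption point described above typically looks like a non-blocking select between blocks; this is a generic sketch with stand-in names.

```go
package sketch

import "errors"

var errCompactionAborted = errors.New("compaction aborted")

// writeBlocks checks for an abort signal between blocks, not only between
// output files, so a disable request takes effect quickly even in the middle
// of a large compaction.
func writeBlocks(blocks [][]byte, interrupt <-chan struct{}, write func([]byte) error) error {
	for _, b := range blocks {
		select {
		case <-interrupt:
			return errCompactionAborted
		default:
		}
		if err := write(b); err != nil {
			return err
		}
	}
	return nil
}
```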
Jason Wilder 4d002bb370 Limit concurrent compactions within a shard
This changes full compactions within a shard to run sequentially
instead of running all the compaction groups in parallel.  Normally,
there is only one full compaction group to run.  At times, there could
be several, which causes instability if they all run concurrently,
as they tie up a CPU for long periods of time.

Level compactions are also capped to a max of 4 concurrently running for each level
in a shard.  This prevents sudden spikes in CPU and disk usage due to a large backlog
of tsm files at a given level.
2017-05-12 14:05:24 -06:00
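
A generic sketch of capping concurrency with a counting semaphore, which is the shape of the change described above; names and structure are stand-ins for the engine's actual scheduler.

```go
package sketch

import "sync"

// runLevelCompactions runs at most maxConcurrent compaction groups at once;
// a buffered channel acts as a counting semaphore. Full compactions would
// simply iterate the groups sequentially.
func runLevelCompactions(groups []func(), maxConcurrent int) {
	sem := make(chan struct{}, maxConcurrent)
	var wg sync.WaitGroup
	for _, g := range groups {
		g := g
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent compactions are in flight
		go func() {
			defer wg.Done()
			defer func() { <-sem }()
			g()
		}()
	}
	wg.Wait()
}
```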
Jason Wilder 3d1c0cd981 Don't return compaction plans for files already part of a plan
The compactor prevents the same file from being compacted by different
compaction runs, but this can result in confusing warning messages in
the logs.

This adds compaction plan tracking to the planner so that files are
only part of one plan at a given time.
2017-05-03 16:31:57 -06:00
Jason Wilder 675d7c9d65 Merge branch '1.2' into jw-merge12 2017-03-06 11:09:05 -07:00
Jason Wilder eab012ef61 Fix points missing after compaction
If blocks containing overlapping ranges of time were partially
recombined, it was possible for some points to get dropped
during compactions.  This occurred because the window of time of
the points we need to merge did not account for the partial blocks
created from a prior merge.

Fixes #8084
2017-03-06 10:17:11 -07:00
Edd Robinson 292b30b82b Fix subtle bugs and remove dead code from tsdb 2017-01-17 09:47:34 -08:00