Commit Graph

14901 Commits (8c908c547b9812e31ab8cf3fa5923ef8ede27631)

Author SHA1 Message Date
devanbenz 8c908c547b feat: Reverse block counts 2025-01-14 12:36:14 -06:00
devanbenz 0a2ba1ec3f feat: Use t.Run instead of declaring the test name in the requires 2025-01-14 09:18:26 -06:00
devanbenz 976291aa6b feat: Add test names to the testing struct 2025-01-13 14:23:05 -06:00
devanbenz 3e69f2d002 feat: cleanup 2025-01-13 14:02:33 -06:00
devanbenz 0799f00cea feat: create loop for tests where there should be no further compaction 2025-01-13 14:01:30 -06:00
devanbenz 3748c364a0 feat: Begin 'compacting' tests in to single test 2025-01-13 13:41:44 -06:00
devanbenz 5a614c4912 feat: Adding more tests reversing and mixing up some of the
file sizes and block counts
2025-01-09 15:52:36 -06:00
Devan 371f960664 feat: Fix a merge conflict where a var was renamed from fs -> fss 2025-01-06 15:57:20 -06:00
WeblWabl 1bac192bc8
Merge branch 'master-1.x' into db/4201/compaction-bugs 2025-01-06 12:08:09 -06:00
davidby-influx e974165d25
fix: do not leak file handles from Compactor.write (#25725)
There are a number of code paths in Compactor.write which
on error can lead to leaked file handles to temporary files.
This, in turn, prevents the removal of the temporary files until
InfluxDB is rebooted, releasing the file handles.

closes https://github.com/influxdata/influxdb/issues/25724
2025-01-03 14:43:41 -08:00
davidby-influx 694607a22c
fix: avoid panic if shard group has no shards (#25717) (#25719)
Avoid panicking when mapping points to a shard group
that has no shards. This does not address the root problem,
how the shard group ended up with no shards.

helps: https://github.com/influxdata/influxdb/issues/25715
(cherry picked from commit 5b364b51c8)

closes: https://github.com/influxdata/influxdb/issues/25718
2024-12-27 14:30:01 -08:00
Devan eb0a77dd35 feat: Add a mock backfill test with mixed generations, mixed levels, and mixed block counts 2024-12-26 14:27:49 -06:00
Devan c315b1f2d2 chore: rerun ci 2024-12-26 13:34:03 -06:00
Devan 5e4e2da881 feat: Adds test for planning lower level TSMs with block sizes at aggressive block count 2024-12-26 12:23:18 -06:00
Devan f444518abb feat: Adds check for block counts and adjusts tests to use require.Zero() 2024-12-26 12:04:42 -06:00
Devan c392906969 feat: Remove some overlapping tests
Add a check to ensure that "orphaned" levels are compacted
further with the rest of the shard.
2024-12-20 15:12:52 -06:00
Devan 479de96f9b feat: Add test for another edge case found;
Many tsm generations over level 4 compaction
single tsm generation under level 4 compaction all in
same shard. Group size is over 2 GB for each generation.
2024-12-19 19:03:23 -06:00
Devan 2dd5ef40ed feat: missed a test when updating the variable! whoops! 2024-12-19 12:41:33 -06:00
Devan c93bdfbc55 feat: grammar typo 2024-12-19 12:16:48 -06:00
Devan 4fc4d5546d feat: clarify file counts for reason we are not fully compacted 2024-12-19 12:15:49 -06:00
Devan fc6ca13ea7 feat: Call SingleGenerationReason() once by initializing a
var called SingleGenerationReasonText
2024-12-19 12:04:45 -06:00
Devan cf657a8cad feat: touch 2024-12-18 14:51:04 -06:00
Devan d3afb030bd feat: fix typo 2024-12-18 14:43:35 -06:00
Devan 23d12e1046 feat: Fix up some tests that I forgot to adjust 2024-12-18 14:34:55 -06:00
Devan 403d888020 feat: Adjust tests to include lower level planning function calls 2024-12-18 14:25:06 -06:00
Devan 54c8e1c446 feat: touch 2024-12-18 09:56:00 -06:00
Devan f15d9be415 feat: need to use int64 instead of int 2024-12-18 09:49:03 -06:00
Devan f896a01ec0 feat: Adjust testing and add sprintf for magic vars 2024-12-18 09:41:18 -06:00
Devan 83d28ec079 feat: setting BlockCount idx value to 1 2024-12-17 15:29:05 -06:00
Devan 31535963e1 feat: code removal from debugging 2024-12-17 15:18:39 -06:00
Devan 5387ca3837 feat: adjust test comments 2024-12-17 15:16:46 -06:00
Devan 827e85962d feat: Use named variables for PlanOptimize 2024-12-17 15:09:22 -06:00
Devan 67849aee72 feat: Modify the PR to include optimized compaction
for shards that may have over a 2 GB group size but
many fragmented files (under 2 GB and under aggressive
point per block count)
2024-12-17 15:03:32 -06:00
cpinflux db523227a2
feat: Added fluxQueryRespBytes metric to 1.x /debug/vars (#25669)
This PR adds an additional statistic "fluxQueryRespBytes" to the output of /debug/vars, in turn making it available to Telegraf and other monitoring tools.

Closes https://github.com/influxdata/influxdb/issues/25671
2024-12-17 11:35:45 -08:00
Devan d631314385 feat: Modify optimized compaction to cover edge cases
This PR changes the algorithm for compaction to account for the following
cases that were not previously accounted for:

- Many generations with a groupsize over 2 GB
- Single generation with many files and a groupsize under 2 GB
Where groupsize is the total size of the TSM files in said shard directory.

closes https://github.com/influxdata/influxdb/issues/25666
2024-12-16 17:30:35 -06:00
WeblWabl 45a8227ad6
fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578) (#25622) (#25624)
* fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578)

* fix(influxd): update xxhash, avoid stringtoslicebyte in cache

This commit does 3 things:

* it updates xxhash from v1 to v2; v2 includes an assembly arm version of
  Sum64
* it changes the cache storer to write with a string key instead of a
  byte slice. The cache only reads the key which WriteMulti already has
as a string so we can avoid a host of allocations when converting back
and forth from immutable strings to mutable byte slices. This includes
updating the cache ring and ring partition to write with a string key
* it updates the xxhash for finding the cache ring partition to use
Sum64String which uses unsafe pointers to directly use a string as a
byte slice since it only reads the string. Note: this now uses an
assembly version because of the v2 xxhash update. Go 1.22 included new
compiler ability to recognize calls of Method([]byte(myString)) and not
make a copy but from looking at the call sites, I'm not sure the
compiler would recognize it as the conversion to a byte slice was
happening several calls earlier.

That's what this change set does. If we are uncomfortable with any of
these, we can do fewer of them (for example, not upgrade xxhash; and/or
not use the specialized Sum64String, etc).

For the performance issue in maz-rr, I see converting string keys to
byte slices taking between 3-5% of cpu usage on both the primary and
secondary. So while this pr doesn't address directly the increased cpu
usage on the secondary, it makes cpu usage less on both which still
feels like a win. I believe these changes are easier to review than
switching to a byte slice pool that is likely needed in other places as
the compiler provides nearly all of the correctness checks we need (we
are relying also on xxhash v2 being correct).

* helps #550

* chore: fix tests/lint

* chore: don't use assembly version; should inline

This 2 line change causes xxhash to use a purego Sum64 implementation
which allows the compiler to see that Sum64 only reads the byte slice
input, which then means it can skip the string to byte slice allocation,
and since it can skip that, it should inline all the calls to
getPartitionStringKey and Sum64, avoiding 1 call to Sum64String which
isn't inlined.

* chore: update ci build file

the ci build doesn't use the make file!!!

* chore: revert "chore: update ci build file"

This reverts commit 94be66fde03e0bbe18004aab25c0e19051406de2.

* chore: revert "chore: don't use assembly version; should inline"

This reverts commit 67d8d06c02e17e91ba643a2991e30a49308a5283.

(cherry picked from commit 1d334c679ca025645ed93518b7832ae676499cd2)

* feat: need to update go sum

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 06ab224516)
2024-12-06 16:05:03 -06:00
davidby-influx eea87ba94c
fix: log rejected writes to subscriptions (#25589)
Log writes to subscriptions that are rejected because
the queue is full by bytes or by length metrics.
2024-11-25 16:11:04 -08:00
WeblWabl 75eb209f72
feat(influx_inspect): Adds an additional log to rebuild TSI (#25575)
Closes https://github.com/influxdata/feature-requests/issues/612
2024-11-21 15:28:27 -06:00
davidby-influx 19f65f50b7
fix: optimise write window check (#25558)
And expose types and methods for Enterprise use.
2024-11-15 14:41:30 -08:00
davidby-influx 07c261a21a
feat: allow the specification of a write window for retention policies (#25517)
Add FutureWriteLimit and PastWriteLimit to retention
policies. Points which are outside of
now() + FutureWriteLimit
or
now() - PastWriteLimit
will be rejected on write with a PartialWriteError.

closes https://github.com/influxdata/influxdb/issues/25424
2024-11-15 13:30:14 -08:00
davidby-influx d2f874b411
feat: improve logging for subscriptions
Print the subscription name, destination,
retention policy, and database on errors in subscription writes

closes https://github.com/influxdata/influxdb/issues/25518
2024-11-14 15:47:07 -08:00
Geoffrey Wossum 8497fbf0af
chore: remove unnecessary fmt.Sprintf calls (#25536)
Remove unnecessary fmt.Sprintf calls for static code checks in main-2.x.
2024-11-12 11:06:39 -06:00
Geoffrey Wossum 65683bf166
chore: fix logging issues in Store.loadShards (#25529)
Fix reporting shards not opening correctly when they actually did.
Fix race condition with logging in loadShards.
2024-11-12 09:34:05 -06:00
Geoffrey Wossum 0bc167bbd7
chore: loadShards changes to more cleanly support 2.x feature (#25513)
* chore: move shardID parsing and shard filtering into walkShardsAndProcess

* chore: make it impossible to miss sending shardResponse or marking shard as complete

* chore: always count number of shards (preparation for 2.x related feature)

* chore: explicitly load series files and create indices serially

Explicitly load series files and create indices serially. Also
avoid passing them to work functions that don't need them.

* chore: rework loadShards for changes necessary to cancel loading process

* chore: comment improvements

* fix: fix race conditions in TestStore_StartupShardProgress and TestStore_BadShardLoading

* chore: avoid logging nil error

* chore: refactor shard loading and shard walking

Refactor loadShards and CreateShard to use a common shardLoader class that
makes thread-safety easier. Refactor walkShardsAndProcess into findShards.

* chore: improve comment

* chore: rename OpenShard to ReopenShard and implement with shardLoader

Rename Store.OpenShard to Store.ReopenShard and implement using a
shardLoader object. Changes to tests as necessary.

* chore: avoid resetting shard options and locking on Reopen

Avoid resetting shard options when reopening a shard.
Proper mutex locking in Shard.ReopenShard.

* chore: fix formatting issue

* chore: warn on mixed index types in Store.CreateShard

* chore: change from info to warn when invalid shard IDs found in path

* chore: use coarser locking in Store.ReopenShard

* chore: fix typo in comment

* chore: code simplification
2024-11-08 15:49:48 -06:00
WeblWabl 2cab9a2a1f
feat: Adds functionality to clear out bad shard list (#25398)
* feat(tsdb): Adds functionality to clear bad shards list

This PR adds a test and a new method to clear out the bad shards list.
The method returns the values of the shards that it cleared out
along with the errors. This is the first part of the feature
for adding a load-shards command to influxd-ctl.

Closes influxdata/feature-requests#591
2024-10-18 13:22:32 -05:00
Geoffrey Wossum 86e81167b8
feat: allow `influx -import` to import from stdin (#25472)
Allow `influx -import` to import from stdin by specifying `-` as the path.
Example: `influx -import -path -`
2024-10-17 14:33:56 -05:00
WeblWabl 3c87f524ed
feat(logging): Add startup logging for shard counts (#25378)
* feat(tsdb): Adds shard opening progress checks to startup
This PR adds a check to see how many shards are remaining
vs how many shards are opened. This change displays the percent
completed too.

closes influxdata/feature-requests#476
2024-10-16 10:09:15 -05:00
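The progress figure mentioned in commit 3c87f524ed above (shards opened versus shards remaining, plus a percentage) could be computed along these lines. The function name `percentComplete` is hypothetical, and reporting 100% when there are no shards to open is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// percentComplete returns how far shard opening has progressed, as a
// percentage of the total shard count. With no shards to open there is
// nothing left to do, so this sketch reports 100 (an assumption).
func percentComplete(opened, total int) float64 {
	if total <= 0 {
		return 100
	}
	return float64(opened) / float64(total) * 100
}

func main() {
	total := 4
	for opened := 1; opened <= total; opened++ {
		fmt.Printf("opened %d of %d shards (%.1f%%), %d remaining\n",
			opened, total, percentComplete(opened, total), total-opened)
	}
}
```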
Shiwen Cheng 860a74f8a5
fix(backup): fix DBRetentionAndShardFromPath parsing error between-different-os (#25362)
Split paths by both forward and back slashes.

Closes https://github.com/influxdata/influxdb/issues/25361
2024-09-27 16:47:33 -07:00
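The fix in commit 860a74f8a5 above splits paths on both forward and back slashes, so a backup taken on one operating system parses on another. A minimal sketch of that idea follows. The helper `splitDBRPShard` is hypothetical and is not the real `DBRetentionAndShardFromPath`; the assumption that the last three path elements are database, retention policy, and shard is made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDBRPShard is a hypothetical helper. It splits path on either
// '/' or '\' and returns the last three elements, assumed here to be
// database, retention policy, and shard ID.
func splitDBRPShard(path string) (db, rp, shard string, err error) {
	parts := strings.FieldsFunc(path, func(r rune) bool {
		return r == '/' || r == '\\'
	})
	if len(parts) < 3 {
		return "", "", "", fmt.Errorf("path %q has fewer than three elements", path)
	}
	n := len(parts)
	return parts[n-3], parts[n-2], parts[n-1], nil
}

func main() {
	// The same logical path written with Unix and Windows separators.
	for _, p := range []string{"data/mydb/autogen/42", `data\mydb\autogen\42`} {
		db, rp, shard, err := splitDBRPShard(p)
		fmt.Println(db, rp, shard, err)
	}
}
```

Splitting on both separators explicitly, rather than relying on the host's path separator, is what makes the result independent of the operating system that parses the path.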
Shiwen Cheng 1bc0eb4795
fix(tsm1): Fix data race of seriesKeys in deleteSeriesRange (#25268)
Add an RWMutex to allow safe concurrent 
access in deleteSeriesRange
2024-09-27 16:36:27 -07:00
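Commit 1bc0eb4795 above guards shared series keys with an RWMutex. The general pattern looks like the sketch below. The `seriesKeySet` type and its methods are hypothetical and do not reflect the actual data structure used in deleteSeriesRange; they only show readers taking the shared lock and writers taking the exclusive lock.

```go
package main

import (
	"fmt"
	"sync"
)

// seriesKeySet is a hypothetical container for series keys that may be
// read and written from several goroutines at once.
type seriesKeySet struct {
	mu   sync.RWMutex
	keys map[string]struct{}
}

func newSeriesKeySet() *seriesKeySet {
	return &seriesKeySet{keys: make(map[string]struct{})}
}

// add takes the exclusive lock because it mutates the map.
func (s *seriesKeySet) add(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys[key] = struct{}{}
}

// has takes the shared lock, so concurrent readers do not block each other.
func (s *seriesKeySet) has(key string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.keys[key]
	return ok
}

// size also only reads, so the shared lock is sufficient.
func (s *seriesKeySet) size() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.keys)
}

func main() {
	s := newSeriesKeySet()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.add(fmt.Sprintf("cpu,host=h%d", i)) // concurrent writers
			_ = s.has("cpu,host=h0")              // concurrent readers
		}(i)
	}
	wg.Wait()
	fmt.Println("keys stored:", s.size())
}
```

Running a program like this with `go run -race` is a quick way to check that no unsynchronized map access remains.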
Shiwen Cheng 8419439e89
test: fix MeasurementNamesFn missing retentionPolicy parameter (#25275)
Add retention policy to the test function for measurement names,
instead of ignoring the argument passed in.
2024-09-27 15:58:47 -07:00