* feat: Use ldext in build-test.sh
* feat: use env vars
* feat: re-add ./... to go-test-compile
* chore: update cross-builder tag
* chore: change cross builder tag to deb10f07f31767ee55b0b5f87edd34635673cf41
* chore: use latest instead of tag
* feat: run ci
* feat: typo in build-tests.sh
(cherry picked from commit cb5072ce33)
* feat: update crossbuilder and ldflags for go
* feat: Update build_windows for build-tests script
this includes a CGO_LDFLAGS env var when running go-test-compile
(cherry picked from commit 4689f21687)
* feat: update protos
This PR updates the proto files to use protoc-gen-go v1.34.1
and protoc to use v5.29.2
* feat: update makefile target to use v1.34.1 of protoc-gen-go
(cherry picked from commit 26170d4e57)
* chore: loadShards changes to more cleanly support 2.x feature (#25513)
* chore: move shardID parsing and shard filtering into walkShardsAndProcess
* chore: make it impossible to miss sending shardResponse or marking shard as complete
* chore: always count number of shards (preparation for 2.x related feature)
* chore: explicitly load series files and create indices serially
Explicitly load series files and create indices serially. Also
avoid passing them to work functions that don't need them.
* chore: rework loadShards for changes necessary to cancel loading process
* chore: comment improvements
* fix: fix race conditions in TestStore_StartupShardProgress and TestStore_BadShardLoading
* chore: avoid logging nil error
* chore: refactor shard loading and shard walking
Refactor loadShards and CreateShard to use a common shardLoader class that
makes thread-safety easier. Refactor walkShardsAndProcess into findShards.
* chore: improve comment
* chore: rename OpenShard to ReopenShard and implement with shardLoader
Rename Store.OpenShard to Store.ReopenShard and implement using a
shardLoader object. Changes to tests as necessary.
* chore: avoid resetting shard options and locking on Reopen
Avoid resetting shard options when reopening a shard.
Use proper mutex locking in Shard.ReopenShard.
* chore: fix formatting issue
* chore: warn on mixed index types in Store.CreateShard
* chore: change from info to warn when invalid shard IDs found in path
* chore: use coarser locking in Store.ReopenShard
* chore: fix typo in comment
* chore: code simplification
(cherry picked from commit 0bc167bbd7)
* chore: fix logging issues in Store.loadShards
Fix reporting shards not opening correctly when they actually did.
Fix race condition with logging in loadShards.
(cherry picked from commit 65683bf166)
* chore: remove unnecessary fmt.Sprintf calls
Remove unnecessary fmt.Sprintf calls for static code checks in main-2.x.
(cherry picked from commit 8497fbf0af)
* chore: remove unnecessary blank identifier
(cherry picked from commit 5c7479eb14)
Closes: #25555
Add `--pid-file` option to write PID files on startup. The PID filename
is specified by the argument after `--pid-file`. If the PID file already exists, influxd will exit unless the `--overwrite-pid-file` flag is also used.
Example: `influxd --pid-file /var/lib/influxd/influxd.pid`
PID files are automatically removed when the influxd process is shut down.
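The behavior described above can be sketched as follows. This is a minimal illustration, not the influxd implementation: `writePIDFile` and `demo` are hypothetical names, and the use of `O_EXCL` to detect an existing file is an assumption about one reasonable way to get the "exit unless overwrite is requested" semantics.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// writePIDFile (hypothetical helper) writes the current process ID to path.
// Without overwrite, an existing file causes an error; with overwrite, the
// existing file is truncated and rewritten.
func writePIDFile(path string, overwrite bool) error {
	flags := os.O_WRONLY | os.O_CREATE
	if overwrite {
		flags |= os.O_TRUNC
	} else {
		flags |= os.O_EXCL // fail if the file already exists
	}
	f, err := os.OpenFile(path, flags, 0o644)
	if err != nil {
		return fmt.Errorf("opening PID file %q: %w", path, err)
	}
	if _, err := f.WriteString(strconv.Itoa(os.Getpid()) + "\n"); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// demo exercises the three cases in a temporary directory.
func demo() error {
	dir, err := os.MkdirTemp("", "pidfile-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "influxd.pid")

	if err := writePIDFile(path, false); err != nil {
		return fmt.Errorf("first write should succeed: %w", err)
	}
	if err := writePIDFile(path, false); !errors.Is(err, os.ErrExist) {
		return fmt.Errorf("second write should fail with ErrExist, got %v", err)
	}
	if err := writePIDFile(path, true); err != nil {
		return fmt.Errorf("overwrite should succeed: %w", err)
	}
	return os.Remove(path) // mimic removal of the PID file on shutdown
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("ok")
}
```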
Closes: #25498
(cherry picked from commit c35321b470)
(cherry picked from commit 48f760065b)
Add `--storage-wal-flush-on-shutdown` to flush WAL on database shutdown.
On successful shutdown, all WAL data will be committed to TSM files and the
WAL directories will not contain any .wal files.
Clean cherry-pick of #25444 from main-2.x.
Closes: #25422
(cherry picked from commit 96bade409e)
If NewTSMReader() fails because mmap fails, do not
rename the file, because the error is probably
caused by vm.max_map_count being too low.
Closes: #25351
(cherry picked from commit 5aff511e40)
* fix(tsi1/partition/test): fix data races in test code (#57)
* fix(tsi1/partition/test): fix data races in test code
This PR is like #24613 but solves it with a setter
method for MaxLogFileSize which allows unexporting that value and
MaxLogFileAge. There are actually two places locks were needed in test
code. The behavior of production code is unchanged.
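A setter of the kind described above might look roughly like this. The `partition` type, its fields, and `logFileSizeLimit` are simplified stand-ins invented for illustration; only the idea of a lock-guarded setter for an unexported size limit comes from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// partition is a hypothetical, stripped-down stand-in for the real type.
type partition struct {
	mu             sync.RWMutex
	maxLogFileSize int64 // unexported, so tests must go through the setter
}

// SetMaxLogFileSize updates the limit under the write lock and returns the
// previous value so a test can restore it afterwards.
func (p *partition) SetMaxLogFileSize(n int64) (old int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	old, p.maxLogFileSize = p.maxLogFileSize, n
	return old
}

// logFileSizeLimit reads the limit under the read lock.
func (p *partition) logFileSizeLimit() int64 {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.maxLogFileSize
}

func main() {
	p := &partition{maxLogFileSize: 1 << 20}

	// A concurrent reader no longer races with the test changing the limit.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		_ = p.logFileSizeLimit()
	}()

	old := p.SetMaxLogFileSize(1)
	wg.Wait()
	fmt.Println(old, p.logFileSizeLimit())
}
```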
(cherry picked from commit f0235c4daf4b97769db932f7346c1d3aecf57f8f)
* feat: modify error handling to be more idiomatic
closes #24042
* fix: errors.Join() filters nil errors
closes #25341
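For reference, a small sketch of the `errors.Join` behavior relied on here: nil values are discarded, and the result is nil when every value is nil. The helper names (`closeAll`, `ok`, `fail`) are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

func ok() error   { return nil }
func fail() error { return errBoom }

// closeAll runs every step and collects the results. Because errors.Join
// drops nil values, no explicit nil check is needed before appending.
func closeAll(steps ...func() error) error {
	var errs []error
	for _, step := range steps {
		errs = append(errs, step())
	}
	return errors.Join(errs...) // nil if every collected error is nil
}

func main() {
	fmt.Println(closeAll(ok, ok) == nil)
	fmt.Println(errors.Is(closeAll(ok, fail, ok), errBoom))
}
```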
---------
Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 5c9e45f)
(cherry picked from commit b88e74e)
closes #25342
* fix: Update UI to 2.7.9
Pulls in UI 2.7.9 which is based off of the releases/OSS branch, not master. Includes only the one commit to update the templates.
* chore(ui): Update fetch-ui-assets.sh
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.
The fix adds two new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.
If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shard is stuck in-use, the retention service will not be able to
delete it, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.
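The interaction between blocking new readers and polling for in-use shards can be sketched as below. This is a toy model under stated assumptions, not the `Store` code: the `shard` type, its methods, and `tryDelete` are hypothetical, and the real methods operate on the store rather than on a standalone struct.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errReadersBlocked = errors.New("new readers are blocked")

// shard is a hypothetical stand-in that tracks active readers.
type shard struct {
	mu      sync.Mutex
	readers int
	blocked bool
}

// acquire grants read access unless new readers are blocked.
func (s *shard) acquire() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.blocked {
		return errReadersBlocked
	}
	s.readers++
	return nil
}

// release gives up read access previously granted by acquire.
func (s *shard) release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.readers > 0 {
		s.readers--
	}
}

func (s *shard) setNewReadersBlocked(b bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blocked = b
}

func (s *shard) inUse() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.readers > 0
}

// tryDelete blocks new readers first, so no reader can slip in between the
// in-use check and the deletion. If the shard is busy it is skipped and
// unblocked, to be retried on the next retention pass.
func tryDelete(s *shard, del func()) bool {
	s.setNewReadersBlocked(true)
	if s.inUse() {
		s.setNewReadersBlocked(false)
		return false
	}
	del()
	return true
}

func main() {
	s := &shard{}
	_ = s.acquire()
	fmt.Println("deleted while in use:", tryDelete(s, func() {}))
	s.release()
	fmt.Println("deleted when idle:", tryDelete(s, func() {}))
}
```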
This is a port of ad68ec8 from master-1.x to main-2.x.
closes: #25118
(cherry picked from commit b4bd607eef)
(cherry picked from commit cb8cfe3510)
Stacks and templates allow specifying file:// URLs. Add command line
option `--template-file-urls-disabled` to disable their use for people who don't require them.
Closes: #25068
(cherry picked from commit 9fd91a554d)
* feat: update flux to latest head (#25051)
* feat: update flux to latest head
Flux has updated some dependencies, including prometheus. Prometheus
has changed in some incompatible ways. Update the flux dependency
to a newer version with the updated prometheus dependency and apply
some small fixes to make everything build. This is in preparation
for a flux release later in the week.
The biggest change is in some tests that were using reflect.DeepEqual
to check the correctness of prometheus metrics. The internals of
these types have changed such that this is not a safe thing to do
anymore. The test now verifies the string representations, as
produced by String(), match.
* fix: update CI script
The scripts/ci/check-system-go-matches-go-mod.sh is failing because
newer go toolchains include the bugfix version in go.mod's go
directive. Update the script to check the major and minor versions
reported by both tools match.
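The actual check is a shell script; purely as an illustration of the comparison it performs, here is a Go sketch. The function names `majorMinor` and `sameToolchain` are hypothetical, and the handling of a leading `go` prefix is an assumption about how the two version strings might differ.

```go
package main

import (
	"fmt"
	"strings"
)

// majorMinor reduces a version such as "go1.22.3" or "1.22.3" to "1.22".
// Strings with fewer than two dot-separated parts are returned as-is
// (after the optional "go" prefix is removed).
func majorMinor(v string) string {
	v = strings.TrimPrefix(v, "go")
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return v
	}
	return parts[0] + "." + parts[1]
}

// sameToolchain reports whether two versions agree on major and minor,
// ignoring the bugfix component.
func sameToolchain(system, goMod string) bool {
	return majorMinor(system) == majorMinor(goMod)
}

func main() {
	fmt.Println(sameToolchain("go1.22.3", "1.22.0"))
	fmt.Println(sameToolchain("go1.21.9", "1.22.0"))
}
```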
(cherry picked from commit fd0531761c)
* build(flux): update flux to v0.195.1 (#25052)
(cherry picked from commit f4ef091f50)
* fix: update broken flux and perf tests (main-2.x) (#24617)
* chore: download repository key to file
* fix: broken perf tests
Some perf tests had to be temporarily disabled. Work is
needed in the pref_tests repositories to make them work
again.
* fix(tsi1/partition/test): fix data race in test code (#24613)
* fix(tsi1/partition/test): fix data race in test code
TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.
* fixes #24042, for #24040
* chore: complete renaming of mutex in file and fix flux test
The flux test is another failing test because it was using a relative
time range.
---------
Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
Under certain circumstances, the retention service can fail to delete shards from
the store in a timely manner. When the shard groups are pruned based on age, this
leaves orphaned shard files on the disk. The retention service will then not attempt
to remove the obsolete shard files because the meta store does not know about them.
This can cause excessive disk space usage for some users.
This corrects that by requiring shard files to be deleted before the shards can be removed
from the meta store.
fixes: #24529
(cherry picked from commit 7bd3f89d18)
closes https://github.com/influxdata/influxdb/issues/24545
Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
(cherry picked from commit 0dc48b1260)
closes https://github.com/influxdata/influxdb/issues/24546
HTTP 5XX errors were being returned incorrectly from
BoltDB errors that were actually bad requests, e.g.,
names that were too long for buckets, users, and
organizations. Map BoltDB errors to correct Influx
errors and return 4XX errors where appropriate. Also
add op codes to more errors.
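The kind of mapping described above could look roughly like the following. The sentinel errors are local stand-ins rather than the real BoltDB or Influx error values, and `statusFor` is a hypothetical name; the sketch only shows how request-caused storage errors can be separated from genuine server errors.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Stand-in sentinel errors; the real code maps BoltDB errors instead.
var (
	errKeyTooLarge = errors.New("key too large")
	errKeyRequired = errors.New("key required")
	errDiskFailure = errors.New("disk failure")
)

// statusFor picks an HTTP status for a storage-layer error. Errors caused by
// the request itself become 4XX; everything else stays 5XX.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errKeyTooLarge), errors.Is(err, errKeyRequired):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	// A wrapped error, as might happen when a bucket name is too long.
	wrapped := fmt.Errorf("creating bucket: %w", errKeyTooLarge)
	fmt.Println(statusFor(wrapped))
	fmt.Println(statusFor(errDiskFailure))
}
```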
(cherry picked from commit a3fd489864)