Commit Graph

596 Commits (db/update-protos)

Author SHA1 Message Date
Devan 6d3ec2e15d feat: update protos
This PR updates the proto files to use protoc-gen-go v1.34.1
and protoc to use v5.29.2
2025-02-24 09:51:41 -06:00
davidby-influx 8711e2d6cc
fix: prevent differing field types in the same shard (#26025)
* fix: lock MeasurementFields while validating (#25998)

There was a window in which concurrent writes with
differing types for the same field could race during validation.
Lock the MeasurementFields struct during field
validation to avoid this.

closes https://github.com/influxdata/influxdb/issues/23756

(cherry picked from commit 5a20a835a5)

helps https://github.com/influxdata/influxdb/issues/26001

* fix: switch MeasurementFields from atomic.Value to sync.Map (#26022)

Simplify and speed up synchronization for
MeasurementFields structures by switching
from a mutex and atomic.Value to a sync.Map

(cherry picked from commit b617eb24a7)

closes https://github.com/influxdata/influxdb/issues/26001
2025-02-14 12:28:10 -08:00
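A minimal sketch of the idea behind the two fixes above, not the actual InfluxDB code: `sync.Map.LoadOrStore` lets the first writer record a field's type atomically, so a concurrent writer with a different type sees the stored value and is rejected. The `fieldTypes` type and `checkOrSet` method are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// fieldTypes is a hypothetical stand-in for a per-measurement field registry.
type fieldTypes struct {
	m sync.Map // field name (string) -> type name (string)
}

// checkOrSet records typ for name if the field is new. If the field already
// exists with a different type, it returns a conflict error.
func (f *fieldTypes) checkOrSet(name, typ string) error {
	// LoadOrStore is atomic: only one concurrent caller can store the value.
	actual, loaded := f.m.LoadOrStore(name, typ)
	if loaded && actual.(string) != typ {
		return fmt.Errorf("field %q is type %s, cannot write %s", name, actual, typ)
	}
	return nil
}

func main() {
	var f fieldTypes
	var wg sync.WaitGroup
	errs := make([]error, 2)
	for i, typ := range []string{"float", "integer"} {
		wg.Add(1)
		go func(i int, typ string) {
			defer wg.Done()
			errs[i] = f.checkOrSet("value", typ)
		}(i, typ)
	}
	wg.Wait()
	// Exactly one writer wins; which one depends on scheduling.
	fmt.Println(errs[0], errs[1])
}
```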
WeblWabl 982ae57f22
feat: Add error join for file writing in snapshots (#26004) (#26005)
This PR adds an error join to help with handling multiple errors
from snapshot file writers.

(cherry picked from commit 4ad5e2aba7)
2025-02-12 16:12:36 -06:00
WeblWabl 5149774e22
feat: Add error joins/returns (#25996) (#26000)
This PR adds error handling for a branch that previously did not report
OS file removal errors. This is part of EAR #5819.

(cherry picked from commit 306a184a8d)
2025-02-12 09:56:20 -06:00
WeblWabl 73722a5b66
feat: upgrade go to 1.23.5 (#25925)
* feat: This PR updates the go toolchain from 1.22.11 to 1.23.5
2025-01-28 12:58:28 -06:00
davidby-influx dd7b4ce351
fix: move aside TSM file on errBlockRead (#25899)
The error type check for errBlockRead was incorrect,
and bad TSM files were not being moved aside when
that error was encountered. Use errors.Join,
errors.Is, and errors.As to correctly unwrap multiple
errors.

Closes https://github.com/influxdata/influxdb/issues/25838

(cherry picked from commit 800970490a)

Closes https://github.com/influxdata/influxdb/issues/25840
2025-01-22 14:10:14 -08:00
davidby-influx c82d4f86ee
fix: do not leak file handles from Compactor.write (#25725) (#25740)
There are a number of code paths in Compactor.write which
on error can lead to leaked file handles to temporary files.
This, in turn, prevents the removal of the temporary files until
InfluxDB is rebooted, releasing the file handles.

closes https://github.com/influxdata/influxdb/issues/25724

(cherry picked from commit e974165d25)

closes https://github.com/influxdata/influxdb/issues/25739
2025-01-06 09:03:37 -08:00
WeblWabl 06ab224516
fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578) (#25622)
* fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578)

* fix(influxd): update xxhash, avoid stringtoslicebyte in cache

This commit does 3 things:

* it updates xxhash from v1 to v2; v2 includes an assembly arm version of
  Sum64
* it changes the cache storer to write with a string key instead of a
  byte slice. The cache only reads the key which WriteMulti already has
as a string so we can avoid a host of allocations when converting back
and forth from immutable strings to mutable byte slices. This includes
updating the cache ring and ring partition to write with a string key
* it updates the xxhash for finding the cache ring partition to use
Sum64String which uses unsafe pointers to directly use a string as a
byte slice since it only reads the string. Note: this now uses an
assembly version because of the v2 xxhash update. Go 1.22 included new
compiler ability to recognize calls of Method([]byte(myString)) and not
make a copy but from looking at the call sites, I'm not sure the
compiler would recognize it as the conversion to a byte slice was
happening several calls earlier.

That's what this change set does. If we are uncomfortable with any of
these, we can do fewer of them (for example, not upgrade xxhash; and/or
not use the specialized Sum64String, etc).

For the performance issue in maz-rr, I see converting string keys to
byte slices taking between 3-5% of cpu usage on both the primary and
secondary. So while this pr doesn't address directly the increased cpu
usage on the secondary, it makes cpu usage less on both which still
feels like a win. I believe these changes are easier to review than
switching to a byte slice pool that is likely needed in other places as
the compiler provides nearly all of the correctness checks we need (we
are relying also on xxhash v2 being correct).

* helps #550

* chore: fix tests/lint

* chore: don't use assembly version; should inline

This 2 line change causes xxhash to use a purego Sum64 implementation
which allows the compiler to see that Sum64 only reads the byte slice
input, which then means it can skip the string to byte slice allocation,
and since it can skip that, it should inline all the calls to
getPartitionStringKey and Sum64, avoiding 1 call to Sum64String which
isn't inlined.

* chore: update ci build file

the ci build doesn't use the make file!!!

* chore: revert "chore: update ci build file"

This reverts commit 94be66fde03e0bbe18004aab25c0e19051406de2.

* chore: revert "chore: don't use assembly version; should inline"

This reverts commit 67d8d06c02e17e91ba643a2991e30a49308a5283.

(cherry picked from commit 1d334c679ca025645ed93518b7832ae676499cd2)

* feat: need to update go sum

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
2024-12-05 16:57:26 -06:00
Geoffrey Wossum 037c6af6e8
feat: check for uncommitted WRR segments during startup (#25540)
Check for uncommitted WRR segments during startup and abort startup
if found.

Closes: #25503
2024-11-14 15:27:01 -06:00
Geoffrey Wossum 5c7479eb14
chore: loadShards changes to more cleanly support 2.x feature (#25528)
* chore: loadShards changes to more cleanly support 2.x feature (#25513)

* chore: move shardID parsing and shard filtering into walkShardsAndProcess

* chore: make it impossible to miss sending shardResponse or marking shard as complete

* chore: always count number of shards (preparation for 2.x related feature)

* chore: explicitly load series files and create indices serially

Explicitly load series files and create indices serially. Also
avoid passing them to work functions that don't need them.

* chore: rework loadShards for changes necessary to cancel loading process

* chore: comment improvements

* fix: fix race conditions in TestStore_StartupShardProgress and TestStore_BadShardLoading

* chore: avoid logging nil error

* chore: refactor shard loading and shard walking

Refactor loadShards and CreateShard to use a common shardLoader class that
makes thread-safety easier. Refactor walkShardsAndProcess into findShards.

* chore: improve comment

* chore: rename OpenShard to ReopenShard and implement with shardLoader

Rename Store.OpenShard to Store.ReopenShard and implement using a
shardLoader object. Changes to tests as necessary.

* chore: avoid resetting shard options and locking on Reopen

Avoid resetting shard options when reopening a shard.
Proper mutex locking in Shard.ReopenShard.

* chore: fix formatting issue

* chore: warn on mixed index types in Store.CreateShard

* chore: change from info to warn when invalid shard IDs found in path

* chore: use coarser locking in Store.ReopenShard

* chore: fix typo in comment

* chore: code simplification

(cherry picked from commit 0bc167bbd7)

* chore: fix logging issues in Store.loadShards

Fix reporting shards not opening correctly when they actually did.
Fix race condition with logging in loadShards.

(cherry picked from commit 65683bf166)

* chore: remove unnecessary fmt.Sprintf calls

Remove unnecessary fmt.Sprintf calls for static code checks in main-2.x.

(cherry picked from commit 8497fbf0af)

* chore: remove unnecessary blank identifier

* chore: remove unnecessary blank identifier
2024-11-12 14:12:53 -06:00
WeblWabl 2ffb108a27
feat(logging): Add startup logging for shard counts (#25378) (#25507)
* feat(logging): Add startup logging for shard counts (#25378)
This PR adds a check to see how many shards are remaining
vs how many shards are opened. This change displays the percent
completed too.

closes influxdata/feature-requests#476

(cherry picked from commit 3c87f52)

closes https://github.com/influxdata/influxdb/issues/25506
2024-11-01 09:20:35 -05:00
Geoffrey Wossum 96bade409e
feat: add option to flush WAL on shutdown (#25444)
* feat: add option to flush WAL on shutdown

Add `--storage-wal-flush-on-shutdown` to flush WAL on database shutdown.
On successful shutdown, all WAL data will be committed to TSM files and the
WAL directories will not contain any .wal files.

Closes: #25422
2024-10-10 15:27:54 -05:00
Geoffrey Wossum 60e49d854c
chore: replace uses of %v with %w (#25358)
Replace uses of `%v` with `%w` where appropriate in file_store.go

Closes: #25357
2024-09-25 15:12:31 -05:00
WeblWabl b88e74e6bb
fix(tsi1/partition/test): fix data races in test code (#57) (#25338) (#25344)
* fix(tsi1/partition/test): fix data races in test code (#57)

* fix(tsi1/partition/test): fix data races in test code

This PR is like influxdata/influxdb#24613 but solves it with a setter
method for MaxLogFileSize which allows unexporting that value and
MaxLogFileAge. There are actually two places locks were needed in test
code. The behavior of production code is unchanged.

(cherry picked from commit f0235c4daf4b97769db932f7346c1d3aecf57f8f)

* feat: modify error handling to be more idiomatic

closes https://github.com/influxdata/influxdb/issues/24042

* fix: errors.Join() filters nil errors

closes https://github.com/influxdata/influxdb/issues/25341
---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 5c9e45f033)
2024-09-17 13:09:14 -05:00
Geoffrey Wossum 5aff511e40
fix: do not rename files on mmap failure (#25340)
If NewTSMReader() fails because mmap fails, do not
rename the file, because the error is probably
caused by vm.max_map_count being too low

Closes: #25337

(cherry picked from commit ec412f793b)
2024-09-17 12:48:21 -05:00
WeblWabl 5a599383f1
fix(tsi1/partition/test): fix data races in test code (#57) (#25336)
* fix(tsi1/partition/test): fix data races in test code

This PR is like #24613 but solves it with a setter
method for MaxLogFileSize which allows unexporting that value and
MaxLogFileAge. There are actually two places locks were needed in test
code. The behavior of production code is unchanged.

(cherry picked from commit f0235c4daf4b97769db932f7346c1d3aecf57f8f)
2024-09-16 16:51:00 -05:00
Geoffrey Wossum da9615fdc3
chore: improve error messages and logging during shard opening (#25331)
Ported from master-1.x.

(cherry picked from commit 23008e5286)

Closes: #25328
2024-09-13 16:59:17 -05:00
davidby-influx 96c97a76f4
fix: add additional logging on loading fields.idxl files (#25309) (#25319)
Log the path of the file being loaded, and when level=debug
log progress for each set of field changes

closes https://github.com/influxdata/influxdb/issues/25289

(cherry picked from commit 5d8d1120e1)

closes https://github.com/influxdata/influxdb/issues/25311
2024-09-12 13:46:21 -07:00
davidby-influx 031f394d2c
fix: prevent an infinite loop in measurementFieldSetChangeMgr (#25155) (#25156)
The measurementFieldSetChangeMgr has a possibly infinite loop
if the writeRequests channel is closed while in the inner
loop to consolidate write requests. We need to check for ok
on channel receive and exit the loop when ok is false.

closes https://github.com/influxdata/influxdb/issues/25151

(cherry picked from commit 176fca2138)

closes https://github.com/influxdata/influxdb/issues/25153
2024-07-12 20:33:59 -07:00
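The failure mode is easy to reproduce in isolation: a receive from a closed channel always succeeds immediately with a zero value, so a consolidation loop that ignores `ok` never reaches its `default` case. The sketch below is a simplified, hypothetical version of such a loop (`consolidate` is not the real function) with the check in place.

```go
package main

import "fmt"

// consolidate starts a batch with first and drains any further pending
// requests without blocking. It reports whether the channel is still open.
func consolidate(first int, reqs <-chan int) (batch []int, open bool) {
	batch = append(batch, first)
	for {
		select {
		case r, ok := <-reqs:
			if !ok {
				// Closed channel: without this check the loop would spin
				// forever, receiving zero values.
				return batch, false
			}
			batch = append(batch, r)
		default:
			// Nothing more is pending right now.
			return batch, true
		}
	}
}

func main() {
	reqs := make(chan int, 4)
	reqs <- 2
	reqs <- 3
	close(reqs)

	batch, open := consolidate(1, reqs)
	fmt.Println(batch, open) // [1 2 3] false
}
```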
Geoffrey Wossum cb8cfe3510
fix: prevent retention service from hanging (#25077)
* fix: prevent retention service from hanging (#25055)

Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.

The fix adds two new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.

If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shard is stuck in-use, the retention service will not be able to
delete the shard, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.

This is a port of ad68ec8 from master-1.x to main-2.x.

closes: #25076
(cherry picked from commit b4bd607eef)
2024-06-24 12:27:22 -05:00
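The protocol described above can be sketched roughly as follows. This is a simplified model, not the `Store` implementation: the `shard` type and its lowercase methods are hypothetical, and the real methods live on `Store` and take shard identifiers. The point is the ordering: block new readers first, then check for active readers, and unblock again if the shard must be skipped.

```go
package main

import (
	"fmt"
	"sync"
)

// shard is a hypothetical, simplified stand-in for a storage shard.
type shard struct {
	mu                sync.Mutex
	readers           int
	newReadersBlocked bool
}

// acquire registers a reader unless new readers are currently blocked.
func (s *shard) acquire() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.newReadersBlocked {
		return false
	}
	s.readers++
	return true
}

// release unregisters a reader.
func (s *shard) release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.readers--
}

func (s *shard) setNewReadersBlocked(blocked bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.newReadersBlocked = blocked
}

func (s *shard) inUse() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.readers > 0
}

// tryExpire reports whether the shard may be deleted now. An in-use shard is
// skipped instead of waited on, and is retried on the next retention run.
func tryExpire(s *shard) bool {
	s.setNewReadersBlocked(true) // close the window between check and delete
	if s.inUse() {
		s.setNewReadersBlocked(false)
		return false
	}
	return true
}

func main() {
	s := &shard{}
	s.acquire()
	fmt.Println(tryExpire(s)) // false: a reader is active, so skip
	s.release()
	fmt.Println(tryExpire(s)) // true: safe to close and delete
	fmt.Println(s.acquire())  // false: new readers stay blocked
}
```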
davidby-influx 0a4d41bc90
fix: ensure TSMBatchKeyIterator and FileStore close all TSMReaders (#24957) (#24964)
Do not let errors on closing
a TSMReader prevent other
closes.

(cherry picked from commit 82cbdb5478)

closes https://github.com/influxdata/influxdb/issues/24961
2024-05-06 10:45:41 -07:00
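A generic sketch of the pattern, assuming nothing about the real TSMReader type: call `Close` on every item regardless of earlier failures, collect the results, and return them with `errors.Join`, which drops nil entries and returns nil when nothing failed. `fakeReader` and `closeAll` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// fakeReader is a hypothetical closer used to demonstrate the pattern.
type fakeReader struct {
	name   string
	fail   bool
	closed bool
}

func (r *fakeReader) Close() error {
	r.closed = true
	if r.fail {
		return fmt.Errorf("close %s failed", r.name)
	}
	return nil
}

// closeAll closes every closer even if an earlier one fails.
func closeAll(cs ...io.Closer) error {
	var errs []error
	for _, c := range cs {
		errs = append(errs, c.Close())
	}
	// errors.Join discards nil values and returns nil if all were nil.
	return errors.Join(errs...)
}

func main() {
	a := &fakeReader{name: "a"}
	b := &fakeReader{name: "b", fail: true}
	c := &fakeReader{name: "c"}

	err := closeAll(a, b, c)
	fmt.Println(err)                          // close b failed
	fmt.Println(a.closed, b.closed, c.closed) // true true true
}
```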
davidby-influx 73f694ac3c
chore: update google.golang.org/protobuf to 1.33.0 (#24940)
* chore: update google.golang.org/protobuf to 1.33.0

closes https://github.com/influxdata/edge/issues/627

* chore: update protoc
2024-05-01 10:16:23 -04:00
davidby-influx 49d0bef3ea
fix: return and respect cursor errors (#24791) (#24846)
ArrayCursors were ignoring errors, which led to panics when nil
cursors were operated on. This fix passes errors back up the stack
and uses them to enforce healthy cursor creation.

Closes https://github.com/influxdata/influxdb/issues/24789
---------
Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>

(cherry picked from commit fe6c64b21e)

closes https://github.com/influxdata/influxdb/issues/24836
2024-03-26 14:54:32 -07:00
davidby-influx 2066c4be46
fix: improved shard deletion (#24602) (#24844)
Avoid unnecessarily deleting series from the series file
Log all errors on shard deletion

Closes https://github.com/influxdata/influxdb/issues/24834

(cherry picked from commit 8ff06d5a92)

closes https://github.com/influxdata/influxdb/issues/24836
2024-03-26 14:18:08 -07:00
davidby-influx 82dc3430b8
fix: do not panic when empty tags are queried (#24784) (#24786)
Do not panic if a cursor array is nil and the number
of timestamps is retrieved.

closes https://github.com/influxdata/influxdb/issues/24536

(cherry picked from commit bc80e881fa)
2024-03-18 22:03:27 -07:00
Phil Bracikowski 5d801119c5
feat(tsm1/wal): encapsulate expiring WAL files in FileDisposer (#24611)
* feat(tsm1/wal): encapsulate expiring WAL files in FileDisposer

This changeset introduces an interface extension point named
FileDisposer to control what to do with WAL files when they are no
longer needed. Currently, the only implementation is to delete the file
which is the existing behavior.

* chore: accumulate errors

Since we're here, capture the previously ignored fs errors and pass up a
combined error (which the only callers log out).
2024-01-31 12:46:46 -08:00
Phil Bracikowski 713efbc164
fix(tsi1/partition/test): fix data race in test code (#24613)
* fix(tsi1/partition/test): fix data race in test code

TestPartition_Compact_Write_Fail test was not locking the partition
before changing the value of MaxLogFileSize. This PR exports the mutex
of the partition to allow the test to access it and lock. Alternatives
require more changes such as a Setter method if we need to hide the
mutex.

* fixes #24042, for #24040

* chore: complete renaming of mutex in file and fix flux test

The flux test is another failing test because it was using a relative
time range.
2024-01-30 20:01:20 -08:00
Jack 976ef20a32
fix: panic index out of range for invalid series keys [Port to main-2.x] (#24597)
* fix: cherry-pick to main-2.x
2024-01-23 17:09:10 +00:00
davidby-influx a3fd489864
fix: correctly return 4XX errors instead of 5XX errors (#24519)
HTTP 5XX errors were being returned incorrectly from
BoltDB errors that were actually bad requests, e.g., 
names that were too long for buckets, users, and 
organizations. Map BoltDB errors to correct Influx 
errors and return 4XX errors where appropriate. Also 
add op codes to more errors
2023-12-27 08:21:09 -08:00
davidby-influx 081f95147e
fix: avoid SIGBUS when reading non-std series segment files (#24509) (#24520)
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
file. Avoid walking off the end of the file while
iterating series entries in oddly sized files.

closes https://github.com/influxdata/influxdb/issues/24508

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
(cherry picked from commit 969abf3da2)

closes https://github.com/influxdata/influxdb/issues/24511
2023-12-19 15:02:34 -08:00
Jeffrey Smith II c854e53c2b
fix: chmod'ing the manifest is unnecessary (#24165) 2023-04-03 13:09:01 -04:00
Eng Zer Jun 903d30d658
test: use `T.TempDir` to create temporary test directory (#23258)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, temporary directory created using `os.MkdirTemp`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}()
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestSendWrite on Windows

=== FAIL: replications/internal TestSendWrite (0.29s)
    logger.go:130: 2022-06-23T13:00:54.290Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestSendWrite1627281409\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2022-06-23T13:00:54.457Z	ERROR	Error in replication stream	{"replication_id": "0000000000000001", "error": "remote timeout", "retries": 1}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestSendWrite1627281409\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestStore_BadShard on Windows

=== FAIL: tsdb TestStore_BadShard (0.09s)
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Using data dir	{"service": "store", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestStore_BadShard1363295568\\001"}
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Compaction settings	{"service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (start)	{"service": "store", "op_name": "tsdb_open", "op_event": "start"}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (end)	{"service": "store", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "77.3µs"}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestStore_BadShard1363295568\002\data\db0\rp0\1\index\0\L0-00000001.tsl: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestPartition_PrependLogFile_Write_Fail and TestPartition_Compact_Write_Fail on Windows

=== FAIL: tsdb/index/tsi1 TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_PrependLogFile_Write_Failwrite_MANIFEST656030081\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)

=== FAIL: tsdb/index/tsi1 TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_Compact_Write_Failwrite_MANIFEST3398667527\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)

We must close the open file descriptor otherwise the temporary file
cannot be cleaned up on Windows.

Fixes: 619eb1cae6 ("fix: restore in-memory Manifest on write error")
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestReplicationStartMissingQueue on Windows

=== FAIL: TestReplicationStartMissingQueue (1.60s)
    logger.go:130: 2023-03-17T10:42:07.269Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2023-03-17T10:42:07.305Z	INFO	Opened replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestReplicationStartMissingQueue76668607\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: update TestWAL_DiskSize

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestWAL_DiskSize on Windows

=== FAIL: tsdb/engine/tsm1 TestWAL_DiskSize (2.65s)
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestWAL_DiskSize2736073801\001\_00006.wal: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

---------

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-03-21 16:22:11 -04:00
Jeffrey Smith II f74c69c5e4
chore: update to go 1.20 (#24088)
* build: upgrade to go 1.19

* chore: bump go.mod

* chore: `gofmt` changes for doc comments

https://tip.golang.org/doc/comment

* test: update tests for new sort order

* chore: make generate-sources

* chore: make generate-sources

* chore: go 1.20

* chore: handle rand.Seed deprecation

* chore: handle rand.Seed deprecation in tests

---------

Co-authored-by: DStrand1 <dstrandboge@influxdata.com>
2023-02-09 14:14:35 -05:00
Jeffrey Smith II 8ad6e17265
chore: add additional error logging when deleting shard (#24038)
* chore: add additional error logging when deleting shard

* chore: better logging message
2023-02-09 09:10:25 -05:00
davidby-influx 7ad8fbad22
chore: fix trace message text (#23918) 2022-11-16 08:40:26 -05:00
Sam Arnold 4de89afd37
refactor: remove dead iterator code (#23887)
* fix: codegen without needing goimports

* refactor: remove dead code
2022-11-09 19:26:12 -05:00
Jeffrey Smith II 2ad8995355
fix: improve delete speed when a measurement is part of the predicate (#23786)
* fix: improve delete speed when a measurement is part of the predicate

* test: add test for deleting measurement by predicate

* chore: improve error messaging and capturing

* chore: set goland to use the right formatting style
2022-10-14 15:09:32 -04:00
davidby-influx b72848d436
feat: optimize saving changes to fields.idx (#23701) (#23728)
Instead of writing out the complete fields.idx
file when it changes, write out incremental
changes that will be applied to the file on
close and startup.

closes https://github.com/influxdata/influxdb/issues/23653

(cherry picked from commit 80c10c8c04)

closes https://github.com/influxdata/influxdb/issues/23703
2022-09-15 12:15:14 -07:00
Abirdcfly c433342830
chore: remove duplicate word in comments (#23685)
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-09-13 11:00:52 -05:00
davidby-influx 619eb1cae6
fix: restore in-memory Manifest on write error (#23552) (#23578)
Do not update the `FileSet` or `activeLogFile` field in the in-memory
Partition structure if the Manifest file is not correctly saved to
the disk.

closes https://github.com/influxdata/influxdb/issues/23553

(cherry picked from commit a8732dcf52)

closes https://github.com/influxdata/influxdb/issues/23554
2022-07-25 10:53:09 -07:00
davidby-influx f762346ecc
fix: add paths to tsi log and index file errors (#23557) (#23562)
Add paths to various TSI errors on opening and unmarshaling files
to help pinpoint the corrupt files.

Closes https://github.com/influxdata/influxdb/issues/23556

(cherry picked from commit 25cea95beb)

closes https://github.com/influxdata/influxdb/issues/23558
2022-07-19 15:45:42 -07:00
davidby-influx 00edb77253
fix: create TSI MANIFEST files atomically (#23539) (#23546)
When a MANIFEST file is created in TSI, it
should be written to a temp file, then
atomically renamed, to avoid overwriting
the existing file only to fail on the
later write.

closes https://github.com/influxdata/influxdb/issues/23536

(cherry picked from commit 061cf55f2a)

closes https://github.com/influxdata/influxdb/issues/23538
2022-07-14 09:13:11 -07:00
davidby-influx 4789d5402a
fix: improve error messages opening index partitions (#23532) (#23535)
Where possible, add the file path to any errors
on opening, reading, (un)marshaling, or validating
the various files comprising a partition

closes https://github.com/influxdata/influxdb/issues/23506

(cherry picked from commit a2dd708a26)

closes https://github.com/influxdata/influxdb/issues/23534
2022-07-13 13:20:47 -07:00
Dane Strandboge 8bd4fc502d
fix: lost TSI reference / close TagValueSeriesIDIterator in error case (#23461) 2022-06-16 13:35:45 -05:00
davidby-influx 53580ead1d
fix: remember shards that fail Open(), avoid repeated attempts (#23437) (#23455)
If a shard cannot be opened, store its ID and last error.
Prevent future attempts to open during this invocation of
influxDB. This information is not persisted.

closes https://github.com/influxdata/influxdb/issues/23428
closes https://github.com/influxdata/influxdb/issues/23426

(cherry picked from commit 54ac7e54ed)

closes https://github.com/influxdata/influxdb/issues/23434
closes https://github.com/influxdata/influxdb/issues/23436
2022-06-14 13:01:11 -07:00
davidby-influx a9df3f8a7c
fix: fully clean up partially opened TSI (#23430) (#23454)
When one partition in a TSI fails to open, all previously opened
partitions should be cleaned up, and remaining partitions
should not be opened

closes https://github.com/influxdata/influxdb/issues/23427

(cherry picked from commit d3db48e93d)

closes https://github.com/influxdata/influxdb/issues/23432
2022-06-14 11:49:16 -07:00
davidby-influx 8c9768cdb7
fix: replace unprintable and invalid characters in errors (#23387) (#23395)
Replace unprintable and invalid characters with '?'
in logged errors.  Truncate consecutive runs of them to
only 3 repeats of '?'

closes https://github.com/influxdata/influxdb/issues/23386

(cherry picked from commit 0ae0bd6e2e)

closes https://github.com/influxdata/influxdb/issues/23389
2022-06-01 14:42:51 -07:00
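One possible reading of that rule, written as a standalone function. The `sanitize` name and the exact definition of "unprintable" (here: invalid UTF-8 bytes and anything `unicode.IsPrint` rejects) are assumptions; the actual implementation may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// sanitize replaces unprintable runes and invalid UTF-8 bytes with '?', and
// emits at most three '?' for any consecutive run of them.
func sanitize(s string) string {
	var b strings.Builder
	run := 0 // length of the current run of bad characters
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		i += size
		// A RuneError of width 1 marks an invalid byte, not a real U+FFFD.
		bad := (r == utf8.RuneError && size == 1) || !unicode.IsPrint(r)
		if !bad {
			run = 0
			b.WriteRune(r)
			continue
		}
		if run < 3 {
			b.WriteByte('?')
		}
		run++
	}
	return b.String()
}

func main() {
	fmt.Println(sanitize("bad\x00\x01\x02\x03\x04key")) // bad???key
	fmt.Println(sanitize("a\xffb"))                     // a?b
}
```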
Geoffrey Wossum 30a9fd43f6
fix: MeasurementsCardinality should not be less than 0 (#23304)
Clamp the value of Store.MeasurementsCardinality so that it can not be less
than 0. This primarily shows up as a negative numMeasurements value in
/debug/vars under some circumstances.

refs #23285

(cherry picked from commit 160cf678d5)
2022-04-26 23:37:09 -05:00
Dane Strandboge 82d1123e78
build: upgrade to Go 1.18.1 (#23252) 2022-04-13 15:24:27 -05:00
Andrew Charlton 4e08604e48
feat: Add MeasurementNames method to MeasurementFieldSet (#23173) 2022-03-15 10:21:38 +00:00