Commit Graph

2775 Commits (1.11)

Author SHA1 Message Date
WeblWabl a501bb11b0
feat: Upgrade flux to v0.196.1 (#26041) (#26060)
* feat: Upgrade flux to v0.196.1 (#26041)

* feat: update flux to 0.196.1

* feat: Update proto files
This updates from protoc-gen-go v1.33.0 -> v1.34.1
and protoc from v5.26.1 -> v5.29.2

(cherry picked from commit 03b6ed2bed)
2025-02-24 14:36:53 -06:00
WeblWabl 1fd1ad63b9
fix(tsi1/partition/test): fix data races in test code (#57) (#25338) (#25345)
* feat: add hook for optimizing series reads based on authorizer (#25207)

* fix(tsi1/partition/test): fix data races in test code (#57) (#25338)

* fix(tsi1/partition/test): fix data races in test code (#57)

* fix(tsi1/partition/test): fix data races in test code

This PR is like influxdata/influxdb#24613 but solves it with a setter
method for MaxLogFileSize which allows unexporting that value and
MaxLogFileAge. There are actually two places locks were needed in test
code. The behavior of production code is unchanged.

(cherry picked from commit f0235c4daf4b97769db932f7346c1d3aecf57f8f)

* feat: modify error handling to be more idiomatic

closes https://github.com/influxdata/influxdb/issues/24042

* fix: errors.Join() filters nil errors

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 5c9e45f033)

---------

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
2024-09-17 13:09:03 -05:00
Geoffrey Wossum 733faa40d5
chore: improve error messages and logging during shard opening (#25323)
(cherry picked from commit 23008e5286)

Closes: #25320
2024-09-12 15:59:05 -05:00
davidby-influx 2a6c42ace6
fix: add additional logging on loading fields.idxl files (#25309) (#25318)
Log the path of the file being loaded, and when level=debug
log progress fpr each set of field changes

closes https://github.com/influxdata/influxdb/issues/25289

(cherry picked from commit 5d8d1120e1)

closes https://github.com/influxdata/influxdb/issues/25310
2024-09-12 11:38:39 -07:00
WeblWabl ee51e5c54c
fix(tsi1/partition/test): fix data race in test code (#25298)
* feat: add hook for optimizing series reads based on authorizer (#25207)

* fix(edge): Backporting #24613 in to 1.11

* fix(test): fixes typo from merge in test

* fix(testing): adds error checking to cleanups

---------

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
2024-09-10 12:07:13 -05:00
WeblWabl a709074fb3
feat: add hook for optimizing series reads based on authorizer (#25207) (#25283) 2024-09-04 18:40:00 -05:00
davidby-influx 1103783870
fix(tsi1): fix data race between appendEntry and FlushAndSync tsi1.(*LogFile) (#25182) (#25243)
Extend lock lifespan to encompass the
flushAndSync() call to avoid a race

closes https://github.com/influxdata/influxdb/issues/25181

(cherry picked from commit 7333da9592)

closes https://github.com/influxdata/influxdb/issues/25186

Co-authored-by: Shiwen Cheng <chengshiwen0103@gmail.com>
2024-08-12 18:11:23 -07:00
davidby-influx 01a35eb79a
fix: prevent an infinite loop in measurementFieldSetChangeMgr (#25155) (#25241)
The measurementFieldSetChangeMgr has a possibly infinite loop
if the writeRequests channel is closed while in the inner
loop to consolidate write requests. We need to check for ok
on channel receive and exit the loop when ok is false.

closes https://github.com/influxdata/influxdb/issues/25151

(cherry picked from commit 176fca2138)

closes https://github.com/influxdata/influxdb/issues/25152
2024-08-12 18:09:21 -07:00
Geoffrey Wossum c7a591b57c
fix: prevent retention service from hanging (#25064)
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.

The fix adds to new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.

If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shards is stuck in-use, the retention service will not be able to
delete the shards, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.

closes: #25063
(cherry picked from commit b4bd607eef)
2024-06-13 16:08:02 -05:00
davidby-influx 8caeff7d0c
fix: ensure TSMBatchKeyIterator and FileStore close all TSMReaders (#24957) (#24963)
Do not let errors on closing
a TSMReader prevent other
closes.

(cherry picked from commit 82cbdb5478)

closes https://github.com/influxdata/influxdb/issues/24960
2024-05-06 10:45:00 -07:00
Brandon Pfeifer 61f6d39b08
chore: upgrade protocol buffers to v5.26.1 (#24950) 2024-05-01 11:06:56 -07:00
davidby-influx 727eb09cc0
fix: return and respect cursor errors (#24791) (#24894)
ArrayCursors were ignoring errors, which led to panics when nil
cursors were operated on. This fix passes errors back up the stack
and uses them to enforce healthy cursor creation.

Closes https://github.com/influxdata/influxdb/issues/24789

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>

(cherry picked from commit fe6c64b21e)

closes https://github.com/influxdata/influxdb/issues/24824
2024-04-08 12:55:19 -07:00
davidby-influx ea7c421e51
fix: improved shard deletion (#24602) (#24893)
Avoid unnecessarily deleting series from the series file
Try harder to delete series from InMem indices
Log all errors on shard deletion

Closes https://github.com/influxdata/influxdb/issues/24834

(cherry picked from commit 8ff06d5a92)

closes https://github.com/influxdata/influxdb/issues/24835
2024-04-08 12:54:43 -07:00
Jakub Bednář 3bf7e94641
build(deps): upgrade google.golang.org/protobuf to v1.33.0 (master-1.11) (#24839) 2024-03-26 14:07:34 +01:00
davidby-influx 31ad53ec6e
fix: do not panic when empty tags are queried (#24784) (#24785)
Do not panic if a cursor array is nil and the number
of timestamps is retrieved.

closes https://github.com/influxdata/influxdb/issues/24536

(cherry picked from commit bc80e881fa)
2024-03-18 15:47:40 -07:00
Jack 8b9a9c63c5
fix: panic index out of range for invalid series keys (#24565) (#24595)
* chore: add scaffolding for naive solution

* feat: test case scaffolding

* fix: implement check for series key before proceeding

* fix: add validation for ReadSeriesKeyMeasurement usage

* refactor: explicit use of series key len

* feat: add remaining check to index

* feat: add check to remaining files

As the Len function is used as part of the parseSeriesKey, this also needs to be accounted for on the nil return from this function as it is used in different contexts

* feat: expand test cases

* chore: go fmt

* chore: update test failure message

* chore: impl feedback on unnecessary sz checks

* feat: expand test cases

* fix: nil series key check

In both sections for index.go there is a pre-existing length check against the series key which should catch invalid values, perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key because we know that subsequent calls will cause a panic where this key is attempted to be used

* fix: remove nil tags check

A key with no tags is valid, so we should not check for BOTH nil key and tags as a key could be nil, which is invalid, yet still have tags and therefore cause the check to pass which we do not want

* feat: extend test cases from feedback

* fix: extend checks for CompareSeriesKeys

* feat: add nilKeyHandler for shared key checking logic

* fix: logical error in nilKeyHandler

Prior to this, the else was always defaulted to at the end of the conditional branch, which causes unexpected behaviour and a failure of a bunch of tests.

* fix: return tags keep nil data

In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow on to this, I just realised that we don't actually want to nil back the tags, even if they're not checked, because having no tags is a valid input so we can simply return whatever we were passed unchanged.

* fix: use len == 0 for extra safety

* feat: extra test for blank series key
2024-01-23 17:07:52 +00:00
davidby-influx c7e3dc1171
chore: upgrade flux (#24505)
* chore: upgrade flux

* chore: execute "go generate" inside cross-builder (#24583)

---------

Co-authored-by: Brandon Pfeifer <bpfeifer@influxdata.com>
2024-01-19 17:37:14 -05:00
davidby-influx 7116861f37
fix: avoid SIGBUS when reading non-std series segment files (#24509) (#24515)
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
the file.  Avoid walking off the end of the file while
iterating series entries in oddly sized files.

closes https://github.com/influxdata/influxdb/issues/24508

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
(cherry picked from commit 969abf3da2)

closes https://github.com/influxdata/influxdb/issues/24510
2023-12-11 16:22:32 -08:00
davidby-influx 734d25f3fd
fix: do not escape CSV output (#24311) (#24313)
CSV output is incorrectly escaped.
Add a boolean flag to tag output
functions to prevent this.

closes https://github.com/influxdata/influxdb/issues/24309

(cherry picked from commit 2dc3dcb3d1)

closes https://github.com/influxdata/influxdb/issues/24310
2023-06-29 13:35:38 -07:00
davidby-influx a03051adee
fix: series file index compaction (#23916) (#24259)
Series file indices monotonically grew even
when series were deleted.  Also stop
ignoring error in series index recovery

Partially closes https://github.com/influxdata/EAR/issues/3643

(cherry picked from commit 53856cdaae)
2023-06-01 10:58:52 -07:00
davidby-influx eaa4c95de2
fix: prevent world-writable MANIFEST files (#24235) (#24236)
When a new MANIFEST file is created, set
its permissions to 644, not 666

closes https://github.com/influxdata/influxdb/issues/24233

(cherry picked from commit aad79e471f)

closes https://github.com/influxdata/influxdb/issues/24234
2023-05-18 12:20:57 -07:00
Brandon Pfeifer e484c4d871
chore: upgrade Go to v1.19.3 (1.x) (#23941)
* chore: upgrade Go to 1.19.3

This re-runs ./generate.sh and ./checkfmt.sh to format and update
source code (this is primarily responsible for the huge diff.)

* fix: update tests to reflect sorting algorithm change
2022-11-28 12:15:47 -05:00
davidby-influx fd7e4aa0f7
chore: fix trace message text (#23917) 2022-11-16 08:40:10 -05:00
Brandon Pfeifer 5976e41d54
feat: upgrade flux to v0.188.0 (#23911)
* feat: upgrade flux to 0.171.0

Tests failing, safety commit

First step in https://github.com/influxdata/influxdb/issues/23815

* fix: remove "org" parameter" from writeOptSource

I attempted to implement the "orgOpt" argument in a similar fashion
to f6669f7512. However, it looks like Flux doesn't accept "org" as
a parameter to "load". It responds with:

Error calling function \"load\" @113:16-113:30: error calling function \"to\" @6:19-6:47: unused arguments [org]

This brings us from 194 passing to 570 passing.

* fix: temporarily disable broken flux tests

These tests expect rows to be stored in a certain order. However,
nothing is specifying the sort order. This has been fixed in a
later update to flux: (see 3d6f47ded).

Temporarily disable these tests until we include a fixed
version of the flux tests.

* chore: add tests from a492993012

This fixes "test-flux.sh" so it runs tests within the "flux/"
directory. This uncovered some other issues with the tests
located within "flux/". These also needed to be updated
to match the newer flux API.

* feat: upgrade flux to 0.172.0

This includes changes made in "cbbf4b27da". Since "test.go" in 2.x
diverged from 1.x, some modifications were required to make this
compatible.

* feat: upgrade flux to 0.173.0

* feat: upgrade flux to v0.174.0

* fix: Update the condition when reseting cursor (#23522)

Filters that contain `or` may change between cursor resets so we must remember to update the condition in the read cursor.

```flux
|> filter(fn: (r) => ((r["_field"] == "field1" and r["_value"]==true) or (r["_field"] == "field2" and r["_value"] == false)))
```

Closes https://github.com/influxdata/flux/issues/4804

* feat: upgrade flux to 0.174.1

* feat: upgrade flux to 0.175.0

* chore: remove end-to-end tests

These were removed in a492993 for 2.x. These tests prevent "go test ./..."
from completing. As stated in the original commit, these tests should now be
handled by the "fluxtest" harness.

* feat: upgrade flux to 0.176.0

Some tests needed to be disabled within the flux harness. This is a
result of enabling "Optimize Aggregate Window" in flux@05a1065f.
These tests are not present in 2.x. Therefore, I am unsure if
the breakage is resolved in a later commit.

* feat: upgrade flux to 0.177.0

* feat: upgrade flux to 0.178.0

* feat: upgrade flux to v0.179.0

This removes all invocations of "flux.RegisterOpSpec". According
to flux@e39096d5, "flux.RegisterOpSpec" does nothing in the
current version of flux and was removed.

* chore: update fluxtest skip list (#23633)

* chore: manually backport 785a465e9a

This removes the reference to "flux.Spec".

* build(flux): update flux to v0.181.0 (#23682)

* build(flux): update flux to v0.184.2

* chore: skip more Flux acceptance tests

There are issues for each skip detailed in test-flux.sh.

* feat: upgrade flux to v0.185.0

This adds "FluxTesting" to the "HTTPD" configuration. This option is
hidden and disabled by default. When "FluxTesting" is set, it
enables the default testing flags for "Flux".

These flags allow the "vectorized float tests" and tests requiring
the "removeRedundantSortNodes" and "labelPolymorphism" flag
enabled to work. These changes are based off of d8553c002e.

flux@3d6f47ded is included within this version of Flux. Therefore
we can now include the "group_*" tests.

* feat: upgrade flux to 0.186.0

* feat: upgrade flux to 0.187.0

* feat: upgrade flux to 0.188.0

* fix: re-run ./generate.sh with updated protoc

* fix: restrict cores to match CircleCI documentation

Co-authored-by: davidby-influx <dbyrne@influxdata.com>
Co-authored-by: Markus Westerlind <marwes91@gmail.com>
Co-authored-by: Sean Brickley <sean@wabr.io>
Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com>
Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>
2022-11-15 15:20:27 -05:00
Sam Arnold 9e9f1be574
fix: remove dead iterator (#23888) 2022-11-09 16:24:01 -05:00
davidby-influx cc26b7653c
fix: remove breaking argument validation for _fieldKeys iterator (#23875)
New argument validation code for _fieldKeys system iterator 
broke Enterprise tests because it is misused all over the 
place. Back out the safety check.
2022-11-09 09:04:44 -08:00
davidby-influx f5da0f50f4
fix: Optimize SHOW FIELD KEY CARDINALITY (#23871)
Use the _fieldKeys system iterator

closes https://github.com/influxdata/influxdb/issues/23840
2022-11-08 08:32:10 -08:00
davidby-influx b17f27a5d9
fix: incorrect error message concatenation (#23729) 2022-09-15 09:26:51 -07:00
davidby-influx 80c10c8c04
feat: optimize saving changes to fields.idx (#23701)
Instead of writing out the complete fields.idx
file when it changes, write out incremental
changes that will be applied to the file on
close and startup.

closes https://github.com/influxdata/influxdb/issues/23653
2022-09-14 13:14:09 -07:00
davidby-influx 84c4f676b0
feat: add type conflict checker to influx_inspect (#23616)
adds two commands "check-schema" and
"merge-schema" to influx_inspect.
These test for field type conflicts
in all fields.idx beneath a directory
and merges the derived schemas if
"check-schema" has been run multiple
times on different directories
2022-08-10 09:36:58 -07:00
davidby-influx eb3cc88772
fix: generalize test for Windows (#23580)
Also eliminate race condition in tests

(cherry picked from commit 7e37a7ad16)
2022-07-21 13:28:10 -07:00
davidby-influx a8732dcf52
fix: restore in-memory Manifest on write error (#23552)
Do not update the `FileSet` or `activeLogFile` field in the in-memory
Partition structure if the Manifest file is not correctly saved to
the disk.

closes https://github.com/influxdata/influxdb/issues/23553
2022-07-20 12:59:15 -07:00
davidby-influx 25cea95beb
fix: add paths to tsi log and index file errors (#23557)
Add paths to various TSI errors on opening and unmarshaling files
to help poinpoint the corrupt files.

Closes https://github.com/influxdata/influxdb/issues/23556
2022-07-19 09:02:20 -07:00
davidby-influx 061cf55f2a
fix: create TSI MANIFEST files atomically (#23539)
When a MANIFEST file is created in TSI, it
should be written to a temp file, then
atomically renamed, to avoid overwriting
the existing file only to fail on the
later write.

closes https://github.com/influxdata/influxdb/issues/23536
2022-07-13 10:11:49 -07:00
davidby-influx a2dd708a26
fix: improve error messages opening index partitions (#23532)
Where possible, add the file path path to any errors
on opening, reading, (un)marshaling, or validating
the various files comprising a partition

closes https://github.com/influxdata/influxdb/issues/23506
2022-07-12 14:22:36 -07:00
davidby-influx a428043f84
fix: lost TSI reference / close TagValueSeriesIDIterator in error case (#23461) (#23462)
(cherry picked from commit 8bd4fc502d)

closes https://github.com/influxdata/influxdb/issues/23460

Co-authored-by: Dane Strandboge <dstrandboge@influxdata.com>
2022-06-16 11:54:04 -07:00
davidby-influx 54ac7e54ed
fix: remember shards that fail Open(), avoid repeated attempts (#23437)
If a shard cannot be opened, store its ID and last error.
Prevent future attempts to open during this invocation of
influxDB. This information is not persisted.

closes https://github.com/influxdata/influxdb/issues/23428
closes https://github.com/influxdata/influxdb/issues/23426
2022-06-13 10:32:47 -07:00
davidby-influx d3db48e93d
fix: fully clean up partially opened TSI (#23430)
When one partition in a TSI fails to open, all previously opened
partitions should be cleaned up, and remaining partitions 
should not be opened

closes https://github.com/influxdata/influxdb/issues/23427
2022-06-10 11:31:29 -07:00
davidby-influx ec412f793b
fix: do not rename files on mmap failure (#23396)
If NewTSMReader() fails because mmap fails, do not
rename the file, because the error is probably
caused by vm.max_map_count being too low

closes https://github.com/influxdata/influxdb/issues/23172
2022-06-07 08:37:00 -07:00
davidby-influx 0ae0bd6e2e
fix: replace unprintable and invalid characters in errors (#23387)
Replace unprintable and invalid characters with '?'
in logged errors.  Truncate consecutive runs of them to
only 3 repeats of '?'

closes https://github.com/influxdata/influxdb/issues/23386
2022-06-01 13:45:24 -07:00
Geoffrey Wossum 160cf678d5
fix: MeasurementsCardinality should not be less than 0 (#23286)
Clamp the value of Store.MeasurementsCardinality so that it can not be less
than 0. This primarily shows up as a negative `numMeasurements` value in
/debug/vars under some circumstances.

refs #23285
2022-04-21 13:32:12 -05:00
Dane Strandboge 0574163566
build: upgrade to go1.18 (#23250) 2022-03-31 16:17:57 -05:00
davidby-influx 7d182158f4
fix: add database to MaxSeriesPerDatabase error message (#23113)
To simplify debugging, print the database name when the
max-series-per-database limit is exceeded in InMem indices.

closes https://github.com/influxdata/influxdb/issues/23112
2022-02-08 11:52:14 -08:00
davidby-influx f27df39c03
fix: add additional testing for MaxSeriesPerDatabase (#23094)
Added test to ensure new code path taken for inmem index
2022-02-02 13:16:09 -08:00
davidby-influx 0c3dca883e
fix: correctly handle MaxSeriesPerDatabaseExceeded (#23091)
Check for the correctly returned PartialWriteError
in (*shard).validateSeriesAndFields, allow partial
writes.

closes https://github.com/influxdata/influxdb/issues/23090
2022-02-01 19:08:51 -08:00
davidby-influx eb3bc7069f
feat: configurable DELETE concurrency (#23055)
Currently, deletion of series or measurements are
serialized. This new feature will add
max-concurrent-deletes to the [data] section of the
 configuration file. Legal values are any positive
 number, defaulting to 1, the current behavior.

 closes https://github.com/influxdata/influxdb/issues/23054
2022-01-13 11:04:57 -08:00
lifeibo 5be1c044c3
fix(tsi): sync index file before close (#21932) 2021-11-24 08:36:03 -05:00
Geoffrey Wossum 91609fdd3f
fix(restore): fix race condition which causes restore command to fail (#22796)
* fix(restore): fix race condition which causes restore command to fail

Fixes a race condition in the restore code path that causes shard data restores
to fail. When the bug occurs, `Error while freeing cold shard resources`
appears in the log files.

fixes issue #15323
2021-11-03 14:21:33 -05:00
davidby-influx af9e89a4d4
fix: detect misquoted tag values and return an error (#22754)
SHOW TAG KEYS FROM "foo" where bar="misquoted" is
erroneous, because the tag value must be enclosed
in single, not double, quotes. Although this
correctly returns no tag keys, it is very
inefficient and has cause out-of-memory failures
at a customer. This fix short-circuits the query.

closes https://github.com/influxdata/influxdb/issues/22755
2021-10-27 11:26:20 -07:00
davidby-influx d9b9e86db9
fix: extend snapshot copy to filesystems that cannot link (#22703)
If os.Link fails with syscall.ENOTSUP, then the file
system does not support links, and we must make copies
to snapshot files for backup. We also automatically make
copies instead of link on Windows, because although it
makes links, their semantics are different from Linux.

closes https://github.com/influxdata/influxdb/issues/16739
2021-10-21 12:53:26 -07:00