influxdb

Commit Graph

Author	SHA1	Message	Date
Geoffrey Wossum	aed9e976ed	chore: fix logging issues in Store.loadShards Fix reporting shards not opening correctly when they actually did. Fix race condition with logging in loadShards.	2024-11-08 17:29:49 -06:00
Geoffrey Wossum	0bc167bbd7	chore: loadShards changes to more cleanly support 2.x feature (#25513 ) * chore: move shardID parsing and shard filtering into walkShardsAndProcess * chore: make it impossible to miss sending shardResponse or marking shard as complete * chore: always count number of shards (preparation for 2.x related feature) * chore: explicitly load series files and create indices serially Explicitly load series files and create indices serially. Also avoid passing them to work functions that don't need them. * chore: rework loadShards for changes necessary to cancel loading process * chore: comment improvements * fix: fix race conditions in TestStore_StartupShardProgress and TestStore_BadShardLoading * chore: avoid logging nil error * chore: refactor shard loading and shard walking Refactor loadShards and CreateShard to use a common shardLoader class that makes thread-safety easier. Refactor walkShardsAndProcess into findShards. * chore: improve comment * chore: rename OpenShard to ReopenShard and implement with shardLoader Rename Store.OpenShard to Store.ReopenShard and implement using a shardLoader object. Changes to tests as necessary. * chore: avoid resetting shard options and locking on Reopen Avoid resetting shard options when reopening a shard. Proper mutex locker in Shard.ReopenShard. * chore: fix formatting issue * chore: warn on mixed index types in Store.CreateShard * chore: change from info to warn when invalid shard IDs found in path * chore: use coarser locking in Store.ReopenShard * chore: fix typo in comment * chore: code simplification	2024-11-08 15:49:48 -06:00
WeblWabl	2cab9a2a1f	feat: Adds functionality to clear out bad shard list (#25398 ) * feat(tsdb): Adds functionality to clear bad shards list This PR adds test and new method to clear out the bad shards list the method will return the values of the shards that it cleared out along with the errors. This is the first part in the feature for adding a load-shards command to influxd-ctl. Closes influxdata/feature-requests#591	2024-10-18 13:22:32 -05:00
WeblWabl	3c87f524ed	feat(logging): Add startup logging for shard counts (#25378 ) * feat(tsdb): Adds shard opening progress checks to startup This PR adds a check to see how many shards are remaining vs how many shards are opened. This change displays the percent completed too. closes influxdata/feature-requests#476	2024-10-16 10:09:15 -05:00
Shiwen Cheng	1bc0eb4795	fix(tsm1): Fix data race of seriesKeys in deleteSeriesRange (#25268 ) Add an RWMutex to allow safe concurrent access in deleteSeriesRange	2024-09-27 16:36:27 -07:00
WeblWabl	8eaa24d813	feat(tsm): Allow for deletion of series outside default rp (#25312 ) * feat(tsm): Allow for deletion of series outside default RP 9d116f6 This PR adds the ability for deletion of series that are outside of the default retention policy. This updates InfluxQL to include changes from: influxdata/influxql#71 closes: influxdata/feature-requests#175	2024-09-17 16:34:14 -05:00
WeblWabl	5c9e45f033	fix(tsi1/partition/test): fix data races in test code (#57 ) (#25338 ) * fix(tsi1/partition/test): fix data races in test code (#57) * fix(tsi1/partition/test): fix data races in test code This PR is like influxdata/influxdb#24613 but solves it with a setter method for MaxLogFileSize which allows unexporting that value and MaxLogFileAge. There are actually two places locks were needed in test code. The behavior of production code is unchanged. (cherry picked from commit f0235c4daf4b97769db932f7346c1d3aecf57f8f) * feat: modify error handling to be more idiomatic closes https://github.com/influxdata/influxdb/issues/24042 * fix: errors.Join() filters nil errors --------- Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>	2024-09-16 20:26:14 -05:00
Geoffrey Wossum	23008e5286	chore: improve error messages and logging during shard opening (#25314 ) * chore: improve error messages and logging during shard opening	2024-09-12 15:11:56 -05:00
davidby-influx	5d8d1120e1	fix: add additional logging on loading fields.idxl files (#25309 ) Log the path of the file being loaded, and when level=debug log progress for each set of field changes closes https://github.com/influxdata/influxdb/issues/25289	2024-09-12 08:25:02 -07:00
WeblWabl	7dc8b1d648	fix(tsi1/partition/test): fix data race in test code (#25288 )	2024-09-11 19:48:41 -05:00
Geoffrey Wossum	2cf2103cc4	feat: add hook for optimizing series reads based on authorizer (#25207 )	2024-08-02 15:03:44 -05:00
Shiwen Cheng	7333da9592	fix(tsi1): fix data race between appendEntry and FlushAndSync tsi1.(*LogFile) (#25182 ) Extend lock lifespan to encompass the flushAndSync() call to avoid a race closes https://github.com/influxdata/influxdb/issues/25181	2024-07-23 14:40:10 -07:00
davidby-influx	176fca2138	fix: prevent an infinite loop in measurementFieldSetChangeMgr (#25155 ) The measurementFieldSetChangeMgr has a possibly infinite loop if the writeRequests channel is closed while in the inner loop to consolidate write requests. We need to check for ok on channel receive and exit the loop when ok is false. closes https://github.com/influxdata/influxdb/issues/25151	2024-07-12 16:52:28 -07:00
Geoffrey Wossum	b4bd607eef	fix: prevent retention service from hanging (#25055 ) * fix: prevent retention service from hanging Fix issue that can cause the retention service to hang waiting on a `Shard.Close` call. When this occurs, no other shards will be deleted by the retention service. This is usually noticed as an increase in disk usage because old shards are not cleaned up. The fix adds two new methods to `Store`, `SetShardNewReadersBlocked` and `InUse`. `InUse` can be used to poll if a shard has active readers, which the retention service uses to skip over in-use shards to prevent the service from hanging. `SetShardNewReadersBlocked` determines if new read access may be granted to a shard. This is required to prevent race conditions around the use of `InUse` and the deletion of shards. If the retention service skips over a shard because it is in-use, the shard will be checked again the next time the retention service is run. It can be deleted on subsequent checks if it is no longer in-use. If the shard is stuck in-use, the retention service will not be able to delete it, which can be observed in the logs for manual intervention. Other shards can still be deleted by the retention service even if a shard is stuck with readers. closes: #25054	2024-06-13 11:07:17 -05:00
davidby-influx	82cbdb5478	fix: ensure TSMBatchKeyIterator and FileStore close all TSMReaders (#24957 ) Do not let errors on closing a TSMReader prevent other closes.	2024-05-06 09:59:30 -07:00
Brandon Pfeifer	d4b16dcd98	chore: upgrade protocol buffers to v5.26.1 (#24949 )	2024-05-01 11:00:26 -07:00
Jakub Bednář	dbbe4611c0	build(deps): upgrade google.golang.org/protobuf to v1.33.0 (master-1.x) (#24818 )	2024-03-26 14:07:28 +01:00
davidby-influx	fe6c64b21e	fix: return and respect cursor errors (#24791 ) ArrayCursors were ignoring errors, which led to panics when nil cursors were operated on. This fix passes errors back up the stack and uses them to enforce healthy cursor creation. Closes https://github.com/influxdata/influxdb/issues/24789 --------- Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>	2024-03-25 17:22:33 -07:00
davidby-influx	8ff06d5a92	fix: improved shard deletion (#24602 ) Avoid unnecessarily deleting series from the series file. Try harder to delete series from InMem indices. Log all errors on shard deletion. Closes https://github.com/influxdata/influxdb/issues/24834	2024-03-25 17:15:31 -07:00
davidby-influx	bc80e881fa	fix: do not panic when empty tags are queried (#24784 ) Do not panic if a cursor array is nil and the number of timestamps is retrieved. closes https://github.com/influxdata/influxdb/issues/24536	2024-03-18 15:28:29 -07:00
Jack	6af0be9234	fix: panic index out of range for invalid series keys (#24565 ) * chore: add scaffolding for naive solution * feat: test case scaffolding * fix: implement check for series key before proceeding * fix: add validation for ReadSeriesKeyMeasurement usage * refactor: explicit use of series key len * feat: add remaining check to index * feat: add check to remaining files As the Len function is used as part of the parseSeriesKey, this also needs to be accounted for on the nil return from this function as it is used in different contexts * feat: expand test cases * chore: go fmt * chore: update test failure message * chore: impl feedback on unnecessary sz checks * feat: expand test cases * fix: nil series key check In both sections for index.go there is a pre-existing length check against the series key which should catch invalid values, perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key because we know that subsequent calls will cause a panic where this key is attempted to be used * fix: remove nil tags check A key with no tags is valid, so we should not check for BOTH nil key and tags as a key could be nil, which is invalid, yet still have tags and therefore cause the check to pass which we do not want * feat: extend test cases from feedback * fix: extend checks for CompareSeriesKeys * feat: add nilKeyHandler for shared key checking logic * fix: logical error in nilKeyHandler Prior to this, the else was always defaulted to at the end of the conditional branch, which causes unexpected behaviour and a failure of a bunch of tests. * fix: return tags keep nil data In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow on to this, I just realised that we don't actually want to nil back the tags, even if they're not checked, because having no tags is a valid input so we can simply return whatever we were passed unchanged. * fix: use len == 0 for extra safety * feat: extra test for blank series key	2024-01-23 09:44:29 +00:00
davidby-influx	c05b340b72	chore: upgrade flux (#24504 ) * chore: upgrade flux * chore: execute "go generate" inside cross-builder (#24582) --------- Co-authored-by: Brandon Pfeifer <bpfeifer@influxdata.com>	2024-01-19 17:40:48 -05:00
davidby-influx	969abf3da2	fix: avoid SIGBUS when reading non-std series segment files (#24509 ) Some series files which are smaller than the standard sizes cause SIGBUS in influx_inspect and influxd, because entry iteration walks onto mapped memory not backed by the file. Avoid walking off the end of the file while iterating series entries in oddly sized files. closes https://github.com/influxdata/influxdb/issues/24508 Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>	2023-12-08 15:46:11 -08:00
davidby-influx	2dc3dcb3d1	fix: do not escape CSV output (#24311 ) CSV output is incorrectly escaped. Add a boolean flag to tag output functions to prevent this. closes https://github.com/influxdata/influxdb/issues/24309	2023-06-29 12:00:41 -07:00
davidby-influx	53856cdaae	fix: series file index compaction (#23916 ) Series file indices monotonically grew even when series were deleted. Also stop ignoring error in series index recovery Partially closes https://github.com/influxdata/EAR/issues/3643	2023-06-01 10:49:23 -07:00
davidby-influx	aad79e471f	fix: prevent world-writable MANIFEST files (#24235 ) When a new MANIFEST file is created, set its permissions to 644, not 666 closes https://github.com/influxdata/influxdb/issues/24233	2023-05-18 12:07:34 -07:00
Brandon Pfeifer	e484c4d871	chore: upgrade Go to v1.19.3 (1.x) (#23941 ) * chore: upgrade Go to 1.19.3 This re-runs ./generate.sh and ./checkfmt.sh to format and update source code (this is primarily responsible for the huge diff.) * fix: update tests to reflect sorting algorithm change	2022-11-28 12:15:47 -05:00
davidby-influx	fd7e4aa0f7	chore: fix trace message text (#23917 )	2022-11-16 08:40:10 -05:00
Brandon Pfeifer	5976e41d54	feat: upgrade flux to v0.188.0 (#23911 ) * feat: upgrade flux to 0.171.0 Tests failing, safety commit First step in https://github.com/influxdata/influxdb/issues/23815 * fix: remove "org" parameter from writeOptSource I attempted to implement the "orgOpt" argument in a similar fashion to `f6669f7512`. However, it looks like Flux doesn't accept "org" as a parameter to "load". It responds with: Error calling function \"load\" @113:16-113:30: error calling function \"to\" @6:19-6:47: unused arguments [org] This brings us from 194 passing to 570 passing. * fix: temporarily disable broken flux tests These tests expect rows to be stored in a certain order. However, nothing is specifying the sort order. This has been fixed in a later update to flux: (see 3d6f47ded). Temporarily disable these tests until we include a fixed version of the flux tests. * chore: add tests from `a492993012` This fixes "test-flux.sh" so it runs tests within the "flux/" directory. This uncovered some other issues with the tests located within "flux/". These also needed to be updated to match the newer flux API. * feat: upgrade flux to 0.172.0 This includes changes made in "cbbf4b27da". Since "test.go" in 2.x diverged from 1.x, some modifications were required to make this compatible. * feat: upgrade flux to 0.173.0 * feat: upgrade flux to v0.174.0 * fix: Update the condition when resetting cursor (#23522) Filters that contain `or` may change between cursor resets so we must remember to update the condition in the read cursor. ```flux |> filter(fn: (r) => ((r["_field"] == "field1" and r["_value"]==true) or (r["_field"] == "field2" and r["_value"] == false))) ``` Closes https://github.com/influxdata/flux/issues/4804 * feat: upgrade flux to 0.174.1 * feat: upgrade flux to 0.175.0 * chore: remove end-to-end tests These were removed in `a492993` for 2.x. These tests prevent "go test ./..." from completing. As stated in the original commit, these tests should now be handled by the "fluxtest" harness. * feat: upgrade flux to 0.176.0 Some tests needed to be disabled within the flux harness. This is a result of enabling "Optimize Aggregate Window" in flux@05a1065f. These tests are not present in 2.x. Therefore, I am unsure if the breakage is resolved in a later commit. * feat: upgrade flux to 0.177.0 * feat: upgrade flux to 0.178.0 * feat: upgrade flux to v0.179.0 This removes all invocations of "flux.RegisterOpSpec". According to flux@e39096d5, "flux.RegisterOpSpec" does nothing in the current version of flux and was removed. * chore: update fluxtest skip list (#23633) * chore: manually backport `785a465e9a` This removes the reference to "flux.Spec". * build(flux): update flux to v0.181.0 (#23682) * build(flux): update flux to v0.184.2 * chore: skip more Flux acceptance tests There are issues for each skip detailed in test-flux.sh. * feat: upgrade flux to v0.185.0 This adds "FluxTesting" to the "HTTPD" configuration. This option is hidden and disabled by default. When "FluxTesting" is set, it enables the default testing flags for "Flux". These flags allow the "vectorized float tests" and tests requiring the "removeRedundantSortNodes" and "labelPolymorphism" flag enabled to work. These changes are based off of `d8553c002e`. flux@3d6f47ded is included within this version of Flux. Therefore we can now include the "group_" tests. * feat: upgrade flux to 0.186.0 * feat: upgrade flux to 0.187.0 * feat: upgrade flux to 0.188.0 * fix: re-run ./generate.sh with updated protoc * fix: restrict cores to match CircleCI documentation Co-authored-by: davidby-influx <dbyrne@influxdata.com> Co-authored-by: Markus Westerlind <marwes91@gmail.com> Co-authored-by: Sean Brickley <sean@wabr.io> Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com> Co-authored-by: Christopher M. Wolff <chris.wolff@influxdata.com>	2022-11-15 15:20:27 -05:00
Sam Arnold	9e9f1be574	fix: remove dead iterator (#23888 )	2022-11-09 16:24:01 -05:00
davidby-influx	cc26b7653c	fix: remove breaking argument validation for _fieldKeys iterator (#23875 ) New argument validation code for _fieldKeys system iterator broke Enterprise tests because it is misused all over the place. Back out the safety check.	2022-11-09 09:04:44 -08:00
davidby-influx	f5da0f50f4	fix: Optimize SHOW FIELD KEY CARDINALITY (#23871 ) Use the _fieldKeys system iterator closes https://github.com/influxdata/influxdb/issues/23840	2022-11-08 08:32:10 -08:00
davidby-influx	b17f27a5d9	fix: incorrect error message concatenation (#23729 )	2022-09-15 09:26:51 -07:00
davidby-influx	80c10c8c04	feat: optimize saving changes to fields.idx (#23701 ) Instead of writing out the complete fields.idx file when it changes, write out incremental changes that will be applied to the file on close and startup. closes https://github.com/influxdata/influxdb/issues/23653	2022-09-14 13:14:09 -07:00
davidby-influx	84c4f676b0	feat: add type conflict checker to influx_inspect (#23616 ) Adds two commands, "check-schema" and "merge-schema", to influx_inspect. These test for field type conflicts in all fields.idx files beneath a directory and merge the derived schemas if "check-schema" has been run multiple times on different directories.	2022-08-10 09:36:58 -07:00
davidby-influx	eb3cc88772	fix: generalize test for Windows (#23580 ) Also eliminate race condition in tests (cherry picked from commit `7e37a7ad16`)	2022-07-21 13:28:10 -07:00
davidby-influx	a8732dcf52	fix: restore in-memory Manifest on write error (#23552 ) Do not update the `FileSet` or `activeLogFile` field in the in-memory Partition structure if the Manifest file is not correctly saved to the disk. closes https://github.com/influxdata/influxdb/issues/23553	2022-07-20 12:59:15 -07:00
davidby-influx	25cea95beb	fix: add paths to tsi log and index file errors (#23557 ) Add paths to various TSI errors on opening and unmarshaling files to help pinpoint the corrupt files. Closes https://github.com/influxdata/influxdb/issues/23556	2022-07-19 09:02:20 -07:00
davidby-influx	061cf55f2a	fix: create TSI MANIFEST files atomically (#23539 ) When a MANIFEST file is created in TSI, it should be written to a temp file, then atomically renamed, to avoid overwriting the existing file only to fail on the later write. closes https://github.com/influxdata/influxdb/issues/23536	2022-07-13 10:11:49 -07:00
davidby-influx	a2dd708a26	fix: improve error messages opening index partitions (#23532 ) Where possible, add the file path to any errors on opening, reading, (un)marshaling, or validating the various files comprising a partition closes https://github.com/influxdata/influxdb/issues/23506	2022-07-12 14:22:36 -07:00
davidby-influx	a428043f84	fix: lost TSI reference / close TagValueSeriesIDIterator in error case (#23461 ) (#23462 ) (cherry picked from commit `8bd4fc502d`) closes https://github.com/influxdata/influxdb/issues/23460 Co-authored-by: Dane Strandboge <dstrandboge@influxdata.com>	2022-06-16 11:54:04 -07:00
davidby-influx	54ac7e54ed	fix: remember shards that fail Open(), avoid repeated attempts (#23437 ) If a shard cannot be opened, store its ID and last error. Prevent future attempts to open during this invocation of influxDB. This information is not persisted. closes https://github.com/influxdata/influxdb/issues/23428 closes https://github.com/influxdata/influxdb/issues/23426	2022-06-13 10:32:47 -07:00
davidby-influx	d3db48e93d	fix: fully clean up partially opened TSI (#23430 ) When one partition in a TSI fails to open, all previously opened partitions should be cleaned up, and remaining partitions should not be opened closes https://github.com/influxdata/influxdb/issues/23427	2022-06-10 11:31:29 -07:00
davidby-influx	ec412f793b	fix: do not rename files on mmap failure (#23396 ) If NewTSMReader() fails because mmap fails, do not rename the file, because the error is probably caused by vm.max_map_count being too low closes https://github.com/influxdata/influxdb/issues/23172	2022-06-07 08:37:00 -07:00
davidby-influx	0ae0bd6e2e	fix: replace unprintable and invalid characters in errors (#23387 ) Replace unprintable and invalid characters with '?' in logged errors. Truncate consecutive runs of them to only 3 repeats of '?' closes https://github.com/influxdata/influxdb/issues/23386	2022-06-01 13:45:24 -07:00
Geoffrey Wossum	160cf678d5	fix: MeasurementsCardinality should not be less than 0 (#23286 ) Clamp the value of Store.MeasurementsCardinality so that it can not be less than 0. This primarily shows up as a negative `numMeasurements` value in /debug/vars under some circumstances. refs #23285	2022-04-21 13:32:12 -05:00
Dane Strandboge	0574163566	build: upgrade to go1.18 (#23250 )	2022-03-31 16:17:57 -05:00
davidby-influx	7d182158f4	fix: add database to MaxSeriesPerDatabase error message (#23113 ) To simplify debugging, print the database name when the max-series-per-database limit is exceeded in InMem indices. closes https://github.com/influxdata/influxdb/issues/23112	2022-02-08 11:52:14 -08:00
davidby-influx	f27df39c03	fix: add additional testing for MaxSeriesPerDatabase (#23094 ) Added test to ensure new code path taken for inmem index	2022-02-02 13:16:09 -08:00
davidby-influx	0c3dca883e	fix: correctly handle MaxSeriesPerDatabaseExceeded (#23091 ) Check for the correctly returned PartialWriteError in (*shard).validateSeriesAndFields, allow partial writes. closes https://github.com/influxdata/influxdb/issues/23090	2022-02-01 19:08:51 -08:00
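
Commit 176fca2138 above describes checking `ok` on a channel receive so that a consolidation loop exits once the channel is closed. The following is a minimal sketch of that pattern only, not the actual measurementFieldSetChangeMgr code; the `writeRequest` type and the `consolidate` function are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// writeRequest is a hypothetical stand-in for the real request type.
type writeRequest struct{ id int }

// consolidate blocks for one request, then drains any further requests that
// are immediately available. It reports whether the channel is still open.
func consolidate(reqs <-chan writeRequest) (batch []writeRequest, open bool) {
	first, ok := <-reqs
	if !ok {
		return nil, false
	}
	batch = append(batch, first)
	for {
		select {
		case r, ok := <-reqs:
			if !ok {
				// A closed channel is always ready to receive and yields
				// zero values, so without this check the loop never ends.
				return batch, false
			}
			batch = append(batch, r)
		default:
			// Nothing else is queued right now.
			return batch, true
		}
	}
}

func main() {
	reqs := make(chan writeRequest, 3)
	reqs <- writeRequest{1}
	reqs <- writeRequest{2}
	close(reqs)

	batch, open := consolidate(reqs)
	fmt.Println(len(batch), open) // prints "2 false"
}
```

Without the `!ok` branch, the receive case would keep firing on the closed channel and append zero-valued requests indefinitely.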

2780 Commits (gw_fix_load_log)