* chore: loadShards changes to more cleanly support 2.x feature (#25513)
* chore: move shardID parsing and shard filtering into walkShardsAndProcess
* chore: make it impossible to miss sending shardResponse or marking shard as complete
* chore: always count number of shards (preparation for 2.x related feature)
* chore: explicitly load series files and create indices serially
Explicitly load series files and create indices serially. Also
avoid passing them to work functions that don't need them.
* chore: rework loadShards for changes necessary to cancel loading process
* chore: comment improvements
* fix: fix race conditions in TestStore_StartupShardProgress and TestStore_BadShardLoading
* chore: avoid logging nil error
* chore: refactor shard loading and shard walking
Refactor loadShards and CreateShard to use a common shardLoader class that
makes thread-safety easier. Refactor walkShardsAndProcess into findShards.
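A rough sketch of the shape this refactor implies; `shardLoader`, its fields, and the response channel are illustrative, not the actual implementation:

```go
package tsdb

import "sync"

// shardLoader (hypothetical): one object owns the state needed to
// open a shard, so loadShards, CreateShard, and ReopenShard can share
// a single code path whose locking is easy to reason about.
type shardLoader struct {
	mu sync.Mutex
	// The series file, index options, and engine options would live
	// here instead of being threaded through every work function.
}

// load opens one shard. Sending the result in a defer makes it
// impossible to miss delivering a response or marking the shard
// complete, no matter which return path is taken.
func (l *shardLoader) load(id uint64, path string, res chan<- error) {
	var err error
	defer func() { res <- err }()

	l.mu.Lock()
	defer l.mu.Unlock()
	// ... open the shard's series file, index, and engine serially ...
}
```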
* chore: improve comment
* chore: rename OpenShard to ReopenShard and implement with shardLoader
Rename Store.OpenShard to Store.ReopenShard and implement using a
shardLoader object. Changes to tests as necessary.
* chore: avoid resetting shard options and locking on Reopen
Avoid resetting shard options when reopening a shard.
Use a proper mutex locker in Shard.ReopenShard.
* chore: fix formatting issue
* chore: warn on mixed index types in Store.CreateShard
* chore: change from info to warn when invalid shard IDs found in path
* chore: use coarser locking in Store.ReopenShard
* chore: fix typo in comment
* chore: code simplification
(cherry picked from commit 0bc167bbd7)
* chore: fix logging issues in Store.loadShards
Fix incorrectly reporting that shards failed to open when they actually did.
Fix race condition with logging in loadShards.
(cherry picked from commit 65683bf166)
* chore: remove unnecessary fmt.Sprintf calls
Remove unnecessary fmt.Sprintf calls for static code checks in main-2.x.
(cherry picked from commit 8497fbf0af)
* chore: remove unnecessary blank identifier
* chore: remove unnecessary blank identifier
* feat: add option to flush WAL on shutdown
Add `--storage-wal-flush-on-shutdown` to flush the WAL on database shutdown.
On successful shutdown, all WAL data will be committed to TSM files and the
WAL directories will not contain any .wal files.
Closes: #25422
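A minimal sketch of the shutdown path this flag implies, assuming illustrative names (`walFlushOnShutdown`, `writeSnapshot`) rather than the actual implementation:

```go
package storage

// Engine sketch: the flag value is carried on the engine and checked
// once at close time.
type Engine struct {
	walFlushOnShutdown bool // set from --storage-wal-flush-on-shutdown
}

// writeSnapshot stands in for committing the WAL-backed cache to TSM
// files; closeFiles stands in for the normal close path.
func (e *Engine) writeSnapshot() error { return nil }
func (e *Engine) closeFiles() error    { return nil }

func (e *Engine) Close() error {
	if e.walFlushOnShutdown {
		// On success, all WAL data is in TSM files and the WAL
		// directories contain no .wal files.
		if err := e.writeSnapshot(); err != nil {
			return err
		}
	}
	return e.closeFiles()
}
```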
* fix: prevent retention service from hanging (#25055)
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.
The fix adds two new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll whether a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` controls whether
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.
If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shard is stuck in-use, the retention service will not be able to
delete it, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.
This is a port of ad68ec8 from master-1.x to main-2.x.
closes: #25076
(cherry picked from commit b4bd607eef)
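A sketch of the polling pattern these methods enable; the interface here is a stand-in for `*tsdb.Store`, and the real retention service logic differs in detail:

```go
package retention

// shardStore is a minimal stand-in capturing the two new methods
// plus deletion.
type shardStore interface {
	SetShardNewReadersBlocked(shardID uint64, blocked bool) error
	InUse(shardID uint64) (bool, error)
	DeleteShard(shardID uint64) error
}

// deleteIfUnused blocks new readers first so that InUse cannot race
// with a reader acquiring the shard between the check and the delete.
func deleteIfUnused(s shardStore, id uint64) (bool, error) {
	if err := s.SetShardNewReadersBlocked(id, true); err != nil {
		return false, err
	}
	inUse, err := s.InUse(id)
	if err != nil || inUse {
		// Skip for now; the shard is re-checked on the next
		// retention pass. Unblock readers before returning.
		if uerr := s.SetShardNewReadersBlocked(id, false); err == nil {
			err = uerr
		}
		return false, err
	}
	return true, s.DeleteShard(id)
}
```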
* fix: improve delete speed when a measurement is part of the predicate
* test: add test for deleting measurement by predicate
* chore: improve error messaging and capturing
* chore: set GoLand to use the right formatting style
Clamp the value of Store.MeasurementsCardinality so that it cannot be less
than 0. This primarily shows up as a negative numMeasurements value in
/debug/vars under some circumstances.
refs #23285
(cherry picked from commit 160cf678d5)
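The clamp itself is small; a sketch of the idea (`clampCardinality` is an illustrative helper, not the actual code):

```go
package tsdb

// clampCardinality keeps an estimated cardinality from being
// reported below zero; applied to the value that
// Store.MeasurementsCardinality returns, it prevents the negative
// numMeasurements seen in /debug/vars.
func clampCardinality(n int64) int64 {
	if n < 0 {
		return 0
	}
	return n
}
```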
* fix: remove nats for scraper processing
Scrapers now use Go channels instead of NATS and interprocess communication.
This should fix #23085.
Additionally, found and fixed #23106.
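A sketch of the in-process replacement, with illustrative types and names; the producer side was previously a NATS publish and the consumer a NATS subscription:

```go
package scraper

type scrapeResult struct{ /* gathered points */ }

// pipe connects the gatherer and the writer with a buffered Go
// channel instead of a NATS subject.
func pipe(done <-chan struct{}, gather func() scrapeResult, write func(scrapeResult)) {
	results := make(chan scrapeResult, 100)

	go func() { // producer: formerly a NATS publish
		defer close(results)
		for {
			select {
			case <-done:
				return
			case results <- gather():
			}
		}
	}()

	for r := range results { // consumer: formerly a NATS subscription
		write(r)
	}
}
```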
* chore: fix formatting
* chore: fix static check and go.mod
* test: fix some flaky tests
* fix: mark NATS arguments as deprecated
* feat: Add WITH KEY to show tag keys
* fix: add tests for multi measurement tag value queries
* chore: fix linter problems
* chore: revert influxql changes to keep WITH KEY disabled
* chore: add TODO for moving flux tests to flux repo
Co-authored-by: Sam Arnold <sarnold@influxdata.com>
* feat: works with custom iterator
* feat: works with existing iterators
* chore: cleanup
* test: consistent assertions for tests
* fix: better log message if trying to filter on the value of a field key
* fix: comment for handling boolean literal; handle false boolean as well
* fix: make time range checking inclusive
tsdb.Engine.IsIdle and tsdb.Engine.Digest now return a reason string for why the engine & shard are not idle.
Callers can then use this string for logging, if desired. The returned reason does not allocate memory, so the
caller may want to add the shard ID and path for more information in the log. This is intended to be used in
calls from the anti-entropy service in Enterprise.
(cherry picked from commit bf45841359)
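A sketch of the intended caller-side pattern, assuming an `IsIdle() (bool, string)` shape for the changed method:

```go
package example

import "go.uber.org/zap"

// engineLike stands in for tsdb.Engine in this sketch.
type engineLike interface {
	IsIdle() (bool, string)
}

// logIfBusy adds the shard ID and path itself; the reason string
// returned by the engine allocates no memory.
func logIfBusy(e engineLike, log *zap.Logger, shardID uint64, path string) {
	if idle, reason := e.IsIdle(); !idle {
		log.Info("shard not idle",
			zap.Uint64("id", shardID),
			zap.String("path", path),
			zap.String("reason", reason))
	}
}
```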
fixes https://github.com/influxdata/influxdb/issues/21448
(cherry picked from commit c8da9bafbf)
closes https://github.com/influxdata/influxdb/issues/21894
There are some problematic races that occur when deletes happen
against writes to the same points at the same time. This change
introduces guards and an epoch-based system to coordinate these
modifications.
A guard matches a point based on the time, measurement name, and
some conditions loaded from an influxql expression. The intent
is to be as precise as possible without allowing any false
negatives: if a point would be deleted, the guard must match it.
We are allowed to match more points than necessary, at the cost
of slowing down writes.
The epoch-based system keeps track of outstanding writes
deletes and their associated guards. When a delete operation
is going to start, it waits until all current writes are
done, and installs its guard, blocking all future writes that
contain points that may conflict with the delete. This allows
writes to disjoint points to proceed uncontended, and the
implementation is optimized for the assumption that there are few
outstanding deletes. For example, in the case that there are no
deletes, a write just has to take a mutex, bump a counter, and
compare a value against zero. The epoch trackers are per shard,
so that different shards never have to contend with one another.
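A sketch of a per-shard tracker along these lines; names and details are illustrative:

```go
package tsdb

import "sync"

type guard struct {
	// Matches points by time range, measurement name, and an
	// optional influxql condition; may over-match, never under-match.
}

type epochTracker struct {
	mu        sync.Mutex
	writes    int64         // outstanding writes
	guards    []*guard      // installed by in-flight deletes
	writeDone chan struct{} // closed when outstanding writes drain
}

// startWrite is the hot path: with no deletes outstanding it is one
// mutex acquire, a counter bump, and a compare against zero.
func (t *epochTracker) startWrite() []*guard {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.writes++
	if len(t.guards) == 0 {
		return nil
	}
	return t.guards // caller must check its points against these
}

func (t *epochTracker) endWrite() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.writes--; t.writes == 0 && t.writeDone != nil {
		close(t.writeDone)
		t.writeDone = nil
	}
}

// startDelete installs a guard, then hands back a channel the delete
// waits on until writes already in flight have finished.
func (t *epochTracker) startDelete(g *guard) <-chan struct{} {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.guards = append(t.guards, g)
	done := make(chan struct{})
	if t.writes == 0 {
		close(done)
		return done
	}
	if t.writeDone == nil {
		t.writeDone = done
	}
	return t.writeDone
}
```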
TSI1 and inmem indexes have different properties during deletes.
Specifically, inmem shares a global index across all shards, whereas
every tsi1 index is scoped to a specific shard. When deleting
a series, it may cause the last reference to the series across all
shards to be dropped, necessitating a removal from the series file.
Since the inmem index shares the index across all shards, removing
the series when it's removed from the series file is sufficient.
However, in the case of a mixed index database, if the last shard
is a TSI1 shard, the other inmem indexes are not available when we
discover that it was the last reference to the series. This ends
up leaving the series in the inmem index without a series id in
the series file, causing all sorts of misbehavior.
Rather than continue curling ourselves into a ball to try to fix
this unsupported mode, give a helpful error message to the user
that they must run their database in a non-mixed index mode to
allow deletes.
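A sketch of the guard this implies, with an illustrative error and check (the actual detection and message differ):

```go
package tsdb

import "errors"

var errMixedIndexDelete = errors.New(
	"deletes are unsupported on a database with mixed inmem/tsi1 " +
		"shard indexes; convert all shards to one index type")

// checkIndexTypes refuses the delete up front when the database's
// shards do not all use the same index type.
func checkIndexTypes(indexTypes []string) error {
	seen := make(map[string]struct{})
	for _, it := range indexTypes {
		seen[it] = struct{}{}
	}
	if len(seen) > 1 {
		return errMixedIndexDelete
	}
	return nil
}
```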
This commit ensures that any orphaned series (series that are to be
removed and no longer are referenced anywhere in the database) are
removed from the `inmem` index when a shard is dropped.
PR #9204 introduced a maximum default concurrent compaction limit of 4.
The idea was to reduce IO utilisation on large systems with many cores,
and high write load. Often on these systems, disks were not scaled
appropriately to the write volume, and while the write path could
keep up, compactions would saturate disks.
In #9225 work was done to reduce IO saturation by limiting the
compaction throughput. To some extent, both #9204 and #9225 work towards
solving the same problem.
We have recently begun to notice larger clusters suffering from
situations where compactions are not keeping up because they have been
scaled up, but the limit of 4 has stayed in place. While users can
manually override the setting, it seems more user-friendly if we remove
the limit by default, and set it manually in cases where compactions are
causing too much IO on large boxes.
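A sketch of the resulting default, with illustrative names; an explicit config value still wins, and an unset value now scales with the machine instead of being clamped:

```go
package tsdb

import "runtime"

func maxConcurrentCompactions(configured int) int {
	if configured > 0 {
		return configured // manual override from the config
	}
	n := runtime.GOMAXPROCS(0) / 2 // previously also capped at 4
	if n < 1 {
		n = 1
	}
	return n
}
```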