influxdb

Commit Graph

Author	SHA1	Message	Date
devanbenz	703f16a602	chore: Only do debug logs if there are compaction groups	2025-08-01 13:20:35 -05:00
devanbenz	a86441180f	chore: adds logging to limiters prior to apply	2025-08-01 13:06:53 -05:00
devanbenz	fc54ef3272	chore: additional logging of groups	2025-08-01 13:04:05 -05:00
devanbenz	3fbd6f1f52	chore: Adding groups as keys for logs	2025-08-01 13:00:12 -05:00
devanbenz	5f07da3798	chore: Add additional logging around scheduler loop	2025-08-01 12:42:11 -05:00
devanbenz	aa4698e61c	chore: Adding additional trace logging	2025-08-01 12:23:19 -05:00
devanbenz	80902585a8	chore: Adjust logging and add groups	2025-08-01 12:12:45 -05:00
devanbenz	17a8c15c1d	chore: Adding debug level logging for engine.go compaction	2025-08-01 12:09:43 -05:00
Jamie Strandboge	40ec5b01a1	chore(deps): bump golang.org/x/oauth2 from v0.21.0 to 0.27.0 (#26625 )	2025-07-25 11:21:49 -05:00
WeblWabl	0f57087944	feat: Adds LastModifiedOrErr to expose error for LastModified (#26623 )	2025-07-24 20:54:41 -05:00
Phil Bracikowski	4e8a3b389b	feat: file store merge metrics (#26615 ) * feat(1.x,file_store): port metrics for merge work This commit ports metrics around merging tsm blocks when executing a query. These will appear in EXPLAIN ANALYZE results. The new information records the time spent merging blocks, the number of blocks merged, roughly the number of values merged into the first block of each ReadBlock call, and the number of times that single calls to ReadBlock have more than 4 block merges. The multiblock merge is sequential and might benefit from a tree merge algorithm. The latter stat helps identify if the engineering effort would be fruitful. * closes #26614 * chore: switch to a timer for duration printing of times * chore: rename method * fix: avoid race and use new atomic primitive	2025-07-18 12:18:37 -07:00
Phil Bracikowski	1c082def6c	feat(influx_tools): report more than one error type (#26600 ) Without this PR, the export-parquet tool would report on type conflict errors and not name conflict errors in the schema if type conflicts were encountered first. It stopped checking for validation issues once type conflicts were found. This PR changes it so that type and name schema issues are both identified and reported in the command's output. Either still fails an export to parquet; but in --dry-run mode the validation is a useful tool to check for schemas that will be an issue in parquet of influxdbv3. * follows #25297	2025-07-11 15:43:28 -07:00
WeblWabl	57da7aa4e7	feat: Adds time_format param for httpd (#26596 ) * feat: Adds time_format param for httpd * This PR will add a time_format parameter which takes in the value "epoch" or "rfc3339". The default will be "epoch" depending on the value output timestamps will be formatted in epoch or rfc3339. Closes FR#615 * feat: Adding some changes * error if incorrect param * update naming for converter function * combine tests * chore: fmt'ing * feat: A few modifications * Rename convertToRfc3339Nano to convertToTimeFormat * allow time formatting to be passed as parameter * adjust error handling to use already defined timeFormats * merge test data for test cases to reduce boilerplate	2025-07-10 16:48:59 -07:00
davidby-influx	ea36c5ff47	chore: improve logging on compaction failures (#26545 ) Streamline compaction logging, while providing more information to debug remnant temporary files.	2025-06-25 13:54:52 -07:00
WeblWabl	149fb47597	feat: Defer cleanup for log/index compactions, add debug log (#26511 ) I believe that there is something happening which causes CurrentCompactionN() to always be greater than 0. Thus making Partition.Wait() hang forever. Taking a look at some profiles where this issue occurs. I'm seeing a consistent one where we're stuck on Partition.Wait() ``` -----------+------------------------------------------------------- 1 runtime.gopark runtime.chanrecv runtime.chanrecv1 github.com/influxdata/influxdb/tsdb/index/tsi1.(Partition).Wait github.com/influxdata/influxdb/tsdb/index/tsi1.(Partition).Close github.com/influxdata/influxdb/tsdb/index/tsi1.(Index).close github.com/influxdata/influxdb/tsdb/index/tsi1.(Index).Close github.com/influxdata/influxdb/tsdb.(Shard).closeNoLock github.com/influxdata/influxdb/tsdb.(Shard).Close github.com/influxdata/influxdb/tsdb.(Store).DeleteShard github.com/influxdata/influxdb/services/retention.(Service).DeletionCheck.func3 github.com/influxdata/influxdb/services/retention.(Service).DeletionCheck github.com/influxdata/influxdb/services/retention.(Service).run github.com/influxdata/influxdb/services/retention.(*Service).Open.func1 -----------+------------------------------------------------------- ``` Defer'ing compaction count cleanup inside goroutines should help with any hanging current compaction counts. Modify currentCompactionN to be a sync atomic. Adding a debug level log within Compaction.Wait() should aid in debugging.	2025-06-20 13:18:47 -05:00
Geoffrey Wossum	4378e85744	chore: stop publishing nightly changelog (#26539 ) Stop publishing nightly changelog since we do not publish nightly build artifacts. This addresses issues with dependent projects that check status of CI for influxdb. Closes: #26538	2025-06-18 14:15:00 -05:00
Geoffrey Wossum	8ef2aca1ca	fix: stop noisy logging about phantom shards that do not belong to node (#26527 ) Stop noisy logging about phantom shards that do not belong to the current node by checking the shard ownership before logging about the phantom shard. Note that only the logging was inaccurate. This did not accidentally remove shards from the metadata that weren't really phantom shards due to checks in `DropShardMetaRef` implementations. closes: #26525	2025-06-17 09:40:33 -05:00
WeblWabl	7437f275ff	feat: Add new logging for compaction level 5 and remove bug with opt holdoff time (#26488 ) Previously ```go // StartOptHoldOff will create a hold off timer for OptimizedCompaction func (e *Engine) StartOptHoldOff(holdOffDurationCheck time.Duration, optHoldoffStart time.Time, optHoldoffDuration time.Duration) { startOptHoldoff := func(dur time.Duration) { optHoldoffStart = time.Now() optHoldoffDuration = dur e.logger.Info("optimize compaction holdoff timer started", logger.Shard(e.id), zap.Duration("duration", optHoldoffDuration), zap.Time("endTime", optHoldoffStart.Add(optHoldoffDuration))) } startOptHoldoff(holdOffDurationCheck) } ``` was not passing the data by reference which meant we were never modifying the `optHoldoffDuration` and `optHoldoffStart` vars. This PR also adds additional logging to Optimized level 5 compactions to clear up a little bit of confusion around log messages.	2025-06-02 17:51:59 -05:00
Sven Rebhan	c07e237142	feat(influx_tools): Add export to parquet files (#25297 ) Adds a command to export data into per-shard parquet files. To do so, the command iterates over the shards, creates a cumulative schema over the series of a measurement (i.e. a super-set of tags and fields) and exports the data to a parquet file per measurement and shard.	2025-06-02 10:59:54 -07:00
Geoffrey Wossum	1fbe319080	fix: reduce excessive CPU usage during compaction planning (#26432 ) Co-authored-by: devanbenz <devandbenz@gmail.com>	2025-05-27 16:55:20 -05:00
davidby-influx	eab8a8a6e8	fix: add locking in ClearBadShardList (#26423 )	2025-05-19 09:14:07 -07:00
Geoffrey Wossum	66f4dbeaad	fix: limit number of concurrent optimized compactions (#26319 ) Limit number of concurrent optimized compactions so that level compactions do not get starved. Starved level compactions result in a sudden increase in disk usage. Add [data] max-concurrent-optimized-compactions for configuring maximum number of concurrent optimized compactions. Default value is 1. Co-authored-by: davidby-influx <dbyrne@influxdata.com> Co-authored-by: devanbenz <devandbenz@gmail.com> Closes: #26315	2025-05-06 15:42:39 -05:00
davidby-influx	62e803e673	feat: improve dropped point logging (#26257 ) Log the reason for a point being dropped, the type of boundary violated, and the time that was the boundary. Prints the maximum and minimum points (by time) that were dropped closes https://github.com/influxdata/influxdb/issues/26252 * fix: better time formatting and additional testing * fix: differentiate point time boundary violations * chore: clean up switch statement * fix: improve error messages	2025-04-18 15:18:19 -07:00
Jamie Strandboge	f61a082618	chore: update to go 1.23.8 (#26293 )	2025-04-18 13:53:04 -05:00
Jamie Strandboge	58475a1b36	chore: use github.com/golang-jwt/jwt/v4 and update golang.org/x/net to v0.38.0 (1.x) (#26292 ) * chore: update to supported github.com/golang-jwt/jwt/v4 * chore(dep): update golang.org/x/net to v0.38.0	2025-04-18 13:52:55 -05:00
davidby-influx	53329a3ad3	feat: use zap.AtomicLevel for dynamic logging levels (#26182 ) Use the zap.AtomicLevel struct for log levels which allows the level to be changed dynamically. Enterprise will use this feature.	2025-04-17 10:07:33 -07:00
WeblWabl	8358f1beb9	fix: Modify package publishing to fix slack msg & publish_packages (#26279 )	2025-04-16 15:55:57 -05:00
WeblWabl	96e44cac73	fix: PlanOptimize is running too frequently (#26211 ) PlanOptimize is being checked far too frequently. This PR is the simplest change that can be made in order to ensure that PlanOptimize is not being run too much. To alleviate the frequency I've added a lastWrite parameter to PlanOptimize and added an additional test that mocks the edge case out in the wild that led to this PR. Previously in test cases for PlanOptimize I was not checking to see if certain cases would be picked up by Plan. I've adjusted a few of the existing test cases after modifying Plan and PlanOptimize to have the same lastWrite time.	2025-04-08 12:22:29 -05:00
Geoffrey Wossum	61f21c5adb	chore(ci): push artifacts to public bucket (#26190 ) * chore(ci): push artifacts to public bucket (#25435) Clean cherry-pick of #25435 to master-1.x. (cherry picked from commit `ca80b243ed`) * chore: port #24491 to master-1.x Port a portion of #24491 that was not included in previous cherry-picks to master-1.x	2025-03-25 12:31:31 -05:00
WeblWabl	77d6f20894	feat: Upgrade influxql to v1.4.1 (#26181 )	2025-03-21 12:24:38 -05:00
WeblWabl	6cda9c903e	fix: Remove nil dereference (#26154 )	2025-03-18 08:11:22 -05:00
davidby-influx	9e00f0de98	fix: do not panic on invalid multiple subqueries (#26143 ) Multiple subqueries in a FROM clause caused a panic, instead of returning an error because they are syntactically invalid. This corrects that problem closes https://github.com/influxdata/influxdb/issues/26139	2025-03-14 13:38:57 -07:00
WeblWabl	d8bcbd894c	feat: Add CompactPointsPerBlock config opt (#26100 ) * feat: Add CompactPointsPerBlock config opt This PR adds an additional parameter for influxd CompactPointsPerBlock. It adjusts the DefaultAggressiveMaxPointsPerBlock to 10,000. We had discovered that with the points per block set to 100,000 compacted TSM files were increasing. After modifying the points per block to 10,000 we noticed that the file sizes decreased. The value has been set as a parameter that can be adjusted by administrators this allows there to be some tuning if compression problems are encountered.	2025-03-05 14:59:06 -06:00
davidby-influx	2ab5aad52e	chore: add logging to Filestore.purger (#26089 ) Also fixes error type checks in TestCompactor_CompactFull_InProgress	2025-03-05 11:46:07 -08:00
davidby-influx	1efb8dad43	fix: remove temp files on error in Compactor.writeNewFiles (#26074 ) Compactor.writeNewFiles should delete temporary files created on iterations before an error halts the compaction. closes https://github.com/influxdata/influxdb/issues/26073	2025-02-27 08:17:48 -08:00
davidby-influx	ba95c9b0f0	fix: ensure temp files removed on failed compaction (#26070 ) Add more robust temporary file removal on a failed compaction. Don't halt on a failed removal, and don't assume a failed compaction won't generate temporary files. closes https://github.com/influxdata/influxdb/issues/26068	2025-02-26 13:17:17 -08:00
davidby-influx	083b679b56	fix: ensure fields in memory match on disk A field could be created in memory but not saved to disk if a later field in that point was invalid (type conflict, too big) Ensure that if a field is created, it is saved.	2025-02-24 13:53:40 -08:00
WeblWabl	03b6ed2bed	feat: Upgrade flux to v0.196.1 (#26041 ) * feat: update flux to 0.196.1 * feat: Update proto files This updates from protoc-gen-go v1.33.0 -> v1.34.1 and protoc from v5.26.1 -> v5.29.2	2025-02-20 13:46:06 -06:00
davidby-influx	5f576331d3	chore: refactor field creation for maintainability Address review comments in the port work of the field creation. Also fixes one bug in returning the wrong error.	2025-02-18 14:00:11 -08:00
davidby-influx	b617eb24a7	fix: switch MeasurementFields from atomic.Value to sync.Map (#26022 ) Simplify and speed up synchronization for MeasurementFields structures by switching from a mutex and atomic.Value to a sync.Map	2025-02-13 16:53:25 -08:00
davidby-influx	5a20a835a5	fix: lock MeasurementFields while validating (#25998 ) There was a window where a race between writes with differing types for the same field were being validated. Lock the MeasurementFields struct during field validation to avoid this. closes https://github.com/influxdata/influxdb/issues/23756	2025-02-13 11:33:34 -08:00
WeblWabl	4ad5e2aba7	feat: Add error join for file writing in snapshots (#26004 ) This PR adds an error join to help with handling multiple errors from snapshot file writers.	2025-02-12 15:06:43 -06:00
WeblWabl	306a184a8d	feat: Add error joins/returns (#25996 ) This pr adds err handling for branch that did not specify os file removal errors previously. This is part of EAR #5819.	2025-02-11 12:15:25 -06:00
davidby-influx	f54a34ae33	fix: actually call the deferred function (#25952 )	2025-01-31 15:42:38 -08:00
WeblWabl	edf5ff20f6	feat: updates go to 1.23.5 (#25926 ) * feat: updates go to 1.23.5 and gosnowflake to 1.9.0	2025-01-28 13:31:31 -06:00
davidby-influx	800970490a	fix: move aside TSM file on errBlockRead (#25839 ) The error type check for errBlockRead was incorrect, and bad TSM files were not being moved aside when that error was encountered. Use errors.Join, errors.Is, and errors.As to correctly unwrap multiple errors. Closes https://github.com/influxdata/influxdb/issues/25838	2025-01-22 10:46:31 -08:00
WeblWabl	f04105bede	feat: Modify optimized compaction to cover edge cases (#25594 ) * feat: Modify optimized compaction to cover edge cases This PR changes the algorithm for compaction to account for the following cases that were not previously accounted for: - Many generations with a groupsize over 2 GB - Single generation with many files and a groupsize under 2 GB - Where groupsize is the total size of the TSM files in said shard directory. - shards that may have over a 2 GB group size but many fragmented files (under 2 GB and under aggressive point per block count) closes https://github.com/influxdata/influxdb/issues/25666	2025-01-14 14:51:09 -06:00
WeblWabl	e2d76edb40	feat: expose NewEncoder from logging package (#25710 ) * feat: This PR exposes NewEncoder from our internal logger package	2025-01-14 12:15:17 -06:00
mwdmwd	7999835ac3	feat: influx_inspect export from a single tsm file (#25530 ) * feat: This PR adds -tsm file flag to export Adds the ability to use influx_inspect export to export data from a single tsm file, for example influx_inspect export -out - -tsmfile 000000006-000000002.tsm.bad -database thermo -retention autogen.	2025-01-13 13:48:35 -06:00
davidby-influx	e974165d25	fix: do not leak file handles from Compactor.write (#25725 ) There are a number of code paths in Compactor.write which on error can lead to leaked file handles to temporary files. This, in turn, prevents the removal of the temporary files until InfluxDB is rebooted, releasing the file handles. closes https://github.com/influxdata/influxdb/issues/25724	2025-01-03 14:43:41 -08:00

14918 Commits (db/6263/compaction-debug-logging)
