Commit Graph

139 Commits (2.7)

Author SHA1 Message Date
WeblWabl 5900fcfb17
fix: ensure fields in memory match on disk (#26402)
Co-authored-by: davidby-influx <72418212+davidby-influx@users.noreply.github.com>
closes https://github.com/influxdata/influxdb/issues/26035
2025-05-12 15:11:38 -05:00
WeblWabl eb1dd04706
fix: prevent differing field types in the same shard (#26025) (#26403)
Co-authored-by: davidby-influx <72418212+davidby-influx@users.noreply.github.com>
fix: lock MeasurementFields while validating (#25998)
closes https://github.com/influxdata/influxdb/issues/23756
fix: switch MeasurementFields from atomic.Value to sync.Map (#26022)
closes https://github.com/influxdata/influxdb/issues/26001
2025-05-12 12:52:23 -05:00
WeblWabl 68534d3ff2
feat: upgrade go to 1.23.5 (#25925) (#25928)
* feat: This PR updates the go toolchain from 1.22.11 to 1.23.5

(cherry picked from commit 73722a5b66)
2025-01-28 13:46:01 -06:00
Geoffrey Wossum e9e0f744fa
fix: prevent retention service from hanging (#25121)
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.

The fix adds to new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.

If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shards is stuck in-use, the retention service will not be able to
delete the shards, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.

This is a port of ad68ec8 from master-1.x to main-2.x.

closes: #25118
(cherry picked from commit b4bd607eef)
(cherry picked from commit cb8cfe3510)
2024-07-01 14:43:20 -05:00
davidby-influx 535e86965a
fix: avoid SIGBUS when reading non-std series segment files (#24509) (#24520) (#24521)
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
the file.  Avoid walking off the end of the file while
iterating series entries in oddly sized files.

closes https://github.com/influxdata/influxdb/issues/24508

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
(cherry picked from commit 969abf3da2)

closes https://github.com/influxdata/influxdb/issues/24511

(cherry picked from commit 081f95147e)

closes https://github.com/influxdata/influxdb/issues/24512
2023-12-20 09:52:57 -08:00
Eng Zer Jun 903d30d658
test: use `T.TempDir` to create temporary test directory (#23258)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, temporary directory created using `os.MkdirTemp`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestSendWrite on Windows

=== FAIL: replications/internal TestSendWrite (0.29s)
    logger.go:130: 2022-06-23T13:00:54.290Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestSendWrite1627281409\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2022-06-23T13:00:54.457Z	ERROR	Error in replication stream	{"replication_id": "0000000000000001", "error": "remote timeout", "retries": 1}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestSendWrite1627281409\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestStore_BadShard on Windows

=== FAIL: tsdb TestStore_BadShard (0.09s)
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Using data dir	{"service": "store", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestStore_BadShard1363295568\\001"}
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Compaction settings	{"service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (start)	{"service": "store", "op_name": "tsdb_open", "op_event": "start"}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (end)	{"service": "store", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "77.3µs"}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestStore_BadShard1363295568\002\data\db0\rp0\1\index\0\L0-00000001.tsl: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestPartition_PrependLogFile_Write_Fail and TestPartition_Compact_Write_Fail on Windows

=== FAIL: tsdb/index/tsi1 TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_PrependLogFile_Write_Failwrite_MANIFEST656030081\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)

=== FAIL: tsdb/index/tsi1 TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_Compact_Write_Failwrite_MANIFEST3398667527\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)

We must close the open file descriptor otherwise the temporary file
cannot be cleaned up on Windows.

Fixes: 619eb1cae6 ("fix: restore in-memory Manifest on write error")
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestReplicationStartMissingQueue on Windows

=== FAIL: TestReplicationStartMissingQueue (1.60s)
    logger.go:130: 2023-03-17T10:42:07.269Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2023-03-17T10:42:07.305Z	INFO	Opened replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestReplicationStartMissingQueue76668607\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: update TestWAL_DiskSize

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestWAL_DiskSize on Windows

=== FAIL: tsdb/engine/tsm1 TestWAL_DiskSize (2.65s)
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestWAL_DiskSize2736073801\001\_00006.wal: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

---------

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-03-21 16:22:11 -04:00
davidby-influx b72848d436
feat: optimize saving changes to fields.idx (#23701) (#23728)
Instead of writing out the complete fields.idx
file when it changes, write out incremental
changes that will be applied to the file on
close and startup.

closes https://github.com/influxdata/influxdb/issues/23653

(cherry picked from commit 80c10c8c04)

closes https://github.com/influxdata/influxdb/issues/23703
2022-09-15 12:15:14 -07:00
Abirdcfly c433342830
chore: remove duplicate word in comments (#23685)
Signed-off-by: Abirdcfly <fp544037857@gmail.com>

Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-09-13 11:00:52 -05:00
Dane Strandboge 82d1123e78
build: upgrade to Go 1.18.1 (#23252) 2022-04-13 15:24:27 -05:00
Andrew Charlton 4e08604e48
feat: Add MeasurementNames method to MeasurementFieldSet (#23173) 2022-03-15 10:21:38 +00:00
Sam Arnold 5015297d40
fix: more expressive errors (#22448)
* fix: more expressive errors

Closes #22446

* fix: server only logging for untyped errors

* chore: fix formatting
2021-09-13 15:12:35 -04:00
Daniel Moran d747e7ec4e
feat: add config parameters to toggle WAL concurrency and timeouts (#21621)
* feat: add context parameter to Take() method on fixed limiter
* refactor: plumb context through to uses of Take()
* test: update tests to pass context as needed
* feat: add config toggles for setting WAL write concurrency & timeout
2021-06-09 11:03:53 -04:00
davidby-influx d10a727157
fix: avoid rewriting fields.idx unnecessarily (#21592) (#21610)
Under heavy write load creating new fields and measurements
the rewrite of the fields.idx file is a bottleneck. This
enhancement combines multiple writes into a single one and
shares any error return value with all of the combined
invocations. MeasurementFieldSet and the new
MeasurementFieldSetWriter must both now be explicitly
closed.

Closes #21577

(cherry picked from commit f64be286be)

Closes https://github.com/influxdata/influxdb/issues/21598
2021-06-04 13:17:53 -07:00
Daniel Moran 7b1763e791
fix(tsdb): minimize lock contention when adding new fields or measurements (#21228)
fields.idx frequent writes cause lock contention and fields.idx is recreated
when a field or measurement is added in a WritePointsWithContext()
This eliminates locking during the actual file rewrite, and limits it to
the times when the MeasurementFieldSet is actually being read or written
in memory and when the new file is being renamed.

Test verification of correct behavior by checking the fields.idx
file matches the in-memory copy after heavily parallel measurement addition.


Co-authored-by: davidby-influx <72418212+davidby-influx@users.noreply.github.com>
2021-04-15 14:08:28 -04:00
Daniel Moran 727a7b58c1
test: replace influxlogger with zaptest logger (#20589) 2021-02-11 10:12:39 -05:00
Daniel Moran 743aef4a98
fix(tsdb): allow backups during snapshotting, and don't leak tmp files (#20527)
Co-authored-by: davidby-influx <dbyrne@influxdata.com>
2021-01-18 19:02:26 -08:00
Daniel Moran 9aefa6f868
fix(tsdb): never use an inmem index (#20313)
And fix the logging setup for the TSDB storage engine
2020-12-23 07:46:57 -08:00
Brett Buddin b917d8d9b0
chore(influxdb): Placate the linter. 2020-08-27 15:46:32 -04:00
Stuart Carnie dee8977d2c
chore: move v2/v1/tsdb → v2/tsdb 2020-08-26 10:46:47 -07:00
Mark Rushakoff f2898d1992 Wipe out workspace in preparation for v2 merge
"Knock knock."

"Who's there?"

"InfluxDB Veet."

...
2019-01-11 10:38:50 -08:00
Tanya Gordeeva 8b8421049e tsdb: benchmark for many fields 2018-11-02 18:49:28 -07:00
Tanya Gordeeva f13a1293f2 tsdb/shard_test: add comparitive benchmarks for measurement cardinalities
Reuses some existing benchmarks, but ensuring that we write equal numbers of
points for comparison.
2018-11-02 18:49:17 -07:00
Edd Robinson dece5b847f Refactor index names 2018-08-21 14:32:30 +01:00
Edd Robinson 80dc07cbcb Efficient means of getting fields for measurement
If it's known that the read request only needs to use a single
measurement, then we can avoid the need to get field keys via the query
engine.

However, that means that a new method of getting the field keys for a
measurement would be needed. This commit exposes a method to efficiently
get field key names for a measurement across multiple shards.

name
2018-07-18 12:21:54 +01:00
Edd Robinson 0060b83644 create iterator benchmark 2018-07-02 16:47:44 +01:00
Jeff Wendling d55979450a Fix shard benchmarks
at some point, the Inmem field on the engine options became
required, but the benchmarks weren't updated.

also uses filepath everywhere when manipulating file paths.
2018-04-23 12:39:24 -06:00
Edd Robinson c1e1412dae Don't panic when checking for field 2018-03-12 15:25:20 +00:00
Edd Robinson ef5e3a09cd Tidy up test initialisation 2018-01-29 15:01:31 +00:00
Edd Robinson 4ccb6ada69 Remove unused code/cleanup tsdb package 2018-01-20 14:06:15 +00:00
Jason Wilder 1c8676b4a3 Rebuild corrupted fields index when necessary
If the fields.idx was corrupted in someway, it would cause the shard
to fail to load.  Deleting the file will allow it to be rebuilt.

This change handles this automatically so it's rebuilt if necessary
without user intervention.
2018-01-16 11:31:07 -07:00
Edd Robinson 74481b9415 Fix shard tests 2018-01-15 12:00:30 +00:00
Jason Wilder ba9a5af7eb Mark series deleted in series file
This commit adds the ability to correctly mark a series as deleted in
the global series file. Whenever a shard engine determines that a series
should be deleted, it checks with each shard's bitset for series that
are to be deleted and are no longer contained in any shard-local
bitsets.

These series are then removed from the series file.
2018-01-15 12:00:30 +00:00
Stuart Carnie 5dfe3b2645 inmem startup improvments
* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted
  to the inmem package
* improve Measurement and Series types by exporting immutable
  fields and removing unnecessary APIs and locks

Reduced startup time from 28s to 17s. Overall improvement including
#9162 reduces startup from 46s to 17s for 1MM series across 14 shards.
2017-12-29 07:58:52 -07:00
Edd Robinson 73fcf894b6 Fix shard races when accessing index 2017-12-15 18:19:55 +00:00
Edd Robinson 9e3b17fd09 Ensure deleted series are not returned via iterators 2017-12-14 21:29:35 +00:00
Edd Robinson 077cbba0e8 Fix index tests 2017-12-12 21:25:35 +00:00
Edd Robinson f6835632e7 Merge master into branch 2017-12-08 17:11:07 +00:00
Ben Johnson 0e0e7cfc08
Fix tests. 2017-12-07 09:59:39 -07:00
Ben Johnson e0df47d54f
Fixing up tests. 2017-12-02 16:52:34 -07:00
Jason Wilder 887bca752e Skip flaky test on windows 2017-11-28 16:43:45 -07:00
Jason Wilder b674311830 Add magic number to fields index file 2017-11-22 11:17:34 -07:00
Jason Wilder dd1c030815 Remove limit count param on fields
It's not used anymore.
2017-11-22 11:17:34 -07:00
Jason Wilder c14b0e81b7 Save field types to speed up startup
This persists the field types in a shard to avoid having to scan
all the TSM files at startup.
2017-11-22 11:17:34 -07:00
Ben Johnson fc966a1b67
Add series file backup/restore. 2017-11-22 08:55:54 -07:00
Edd Robinson aa17ef55f9 Implement FGA on SHOW SERIES 2017-11-17 11:06:43 +00:00
Ben Johnson ede3fcf98e
intermediate 2017-11-15 16:09:25 -07:00
Ben Johnson 0ffd94a37a
Fix rebase 2017-11-09 09:25:10 -07:00
Edd Robinson 98d584b63f Use index for SHOW X meta queries
When a meta query does not include a time component then it can be
answered exclusively by the index. This should result in a much faster
query execution that if the TSM engine was engaged.

This commit rewrites the following queries such that they make use
of the index where no time component is present:

  - SHOW MEASUREMENTS
  - SHOW SERIES
  - SHOW TAG KEYS
  - SHOW FIELD KEYS
2017-11-06 19:15:00 +00:00
Stuart Carnie f3d45ba301 influxdata/influxdb/influxql -> influxdata/influxql 2017-10-30 14:40:26 -07:00
Stuart Carnie e9313876ab EXPLAIN ANALYZE
* Introduces EXPLAIN ANALYZE command, which
  produces a detailed tree of operations used to
  execute the query.

introduce context.Context to APIs

metrics package

* create groups of named measurements
* safe for concurrent access

tracing package

EXPLAIN ANALYZE implementation for OSS

Serialize EXPLAIN ANALYZE traces from remote nodes

use context.Background for tests

group with other stdlib packages

additional documentation and remove unused API

use influxdb/pkg/testing/assert

remove testify reference
2017-10-20 08:01:37 -07:00