Commit Graph

123 Commits (db/panic-at-the-cursor)

Geoffrey Wossum 96bade409e
feat: add option to flush WAL on shutdown (#25444)
* feat: add option to flush WAL on shutdown

Add `--storage-wal-flush-on-shutdown` to flush WAL on database shutdown.
On successful shutdown, all WAL data will be committed to TSM files and the
WAL directories will not contain any .wal files.

Closes: #25422
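
A hedged sketch of what a shutdown-time flush could look like. The flag name comes from the commit; the Engine type, fields, and method names below are illustrative assumptions, not the actual influxdb code:

    package tsm1sketch

    import (
        "os"
        "path/filepath"
    )

    type Engine struct {
        FlushOnShutdown bool   // wired from --storage-wal-flush-on-shutdown
        WALDir          string // directory holding *.wal segment files
    }

    // snapshot stands in for writing the in-memory cache out as TSM files.
    func (e *Engine) snapshot() error { return nil }

    func (e *Engine) Close() error {
        if !e.FlushOnShutdown {
            return nil
        }
        // Commit all WAL data to TSM files first...
        if err := e.snapshot(); err != nil {
            return err
        }
        // ...then remove the now-redundant WAL segments so the directory is
        // empty after a clean shutdown.
        segments, err := filepath.Glob(filepath.Join(e.WALDir, "*.wal"))
        if err != nil {
            return err
        }
        for _, seg := range segments {
            if err := os.Remove(seg); err != nil {
                return err
            }
        }
        return nil
    }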
2024-10-10 15:27:54 -05:00
Geoffrey Wossum da9615fdc3
chore: improve error messages and logging during shard opening (#25331)
Ported from master-1.x.

(cherry picked from commit 23008e5286)

Closes: #25328
2024-09-13 16:59:17 -05:00
Phil Bracikowski 5d801119c5
feat(tsm1/wal): encapsulate expiring WAL files in FileDisposer (#24611)
* feat(tsm1/wal): encapsulate expiring WAL files in FileDisposer

This changeset introduces an interface extension point named
FileDisposer to control what to do with WAL files when they are no
longer needed. Currently, the only implementation is to delete the file
which is the existing behavior.

* chore: accumulate errors

Since we're here, capture the previously ignored fs errors and pass up a
combined error (which the only callers log).
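
A minimal sketch of that extension point, with assumed method shape (the interface name comes from the commit, the rest is illustrative):

    package walsketch

    import "os"

    // FileDisposer decides what happens to a WAL segment once it is no
    // longer needed by the engine.
    type FileDisposer interface {
        Dispose(path string) error
    }

    // DeleteFileDisposer preserves the existing behavior: delete the file.
    type DeleteFileDisposer struct{}

    func (DeleteFileDisposer) Dispose(path string) error {
        return os.Remove(path)
    }

The "accumulate errors" bullet suggests callers collect the per-file errors into one combined error (e.g. with errors.Join) and log it, rather than dropping them.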
2024-01-31 12:46:46 -08:00
Eng Zer Jun 903d30d658
test: use `T.TempDir` to create temporary test directory (#23258)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, a temporary directory created using `os.MkdirTemp`
needed to be removed manually by calling `os.RemoveAll`, which was omitted
in some tests. The error-handling boilerplate, e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}()
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
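
For illustration, a minimal before/after of this change (test names are made up, not from the PR):

    package example

    import (
        "os"
        "testing"
    )

    // Before: manual creation and cleanup, easy to forget or get wrong.
    func TestWithMkdirTemp(t *testing.T) {
        dir, err := os.MkdirTemp("", "wal-test")
        if err != nil {
            t.Fatal(err)
        }
        defer func() {
            if err := os.RemoveAll(dir); err != nil {
                t.Fatal(err)
            }
        }()
        _ = dir // ... test body ...
    }

    // After: the testing package creates and removes the directory for us.
    func TestWithTempDir(t *testing.T) {
        dir := t.TempDir() // removed automatically when the test completes
        _ = dir            // ... test body ...
    }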

* test: fix failing TestSendWrite on Windows

=== FAIL: replications/internal TestSendWrite (0.29s)
    logger.go:130: 2022-06-23T13:00:54.290Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestSendWrite1627281409\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2022-06-23T13:00:54.457Z	ERROR	Error in replication stream	{"replication_id": "0000000000000001", "error": "remote timeout", "retries": 1}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestSendWrite1627281409\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestStore_BadShard on Windows

=== FAIL: tsdb TestStore_BadShard (0.09s)
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Using data dir	{"service": "store", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestStore_BadShard1363295568\\001"}
    logger.go:130: 2022-06-23T12:18:21.827Z	INFO	Compaction settings	{"service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (start)	{"service": "store", "op_name": "tsdb_open", "op_event": "start"}
    logger.go:130: 2022-06-23T12:18:21.828Z	INFO	Open store (end)	{"service": "store", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "77.3µs"}
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestStore_BadShard1363295568\002\data\db0\rp0\1\index\0\L0-00000001.tsl: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestPartition_PrependLogFile_Write_Fail and TestPartition_Compact_Write_Fail on Windows

=== FAIL: tsdb/index/tsi1 TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_PrependLogFile_Write_Failwrite_MANIFEST656030081\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s)

=== FAIL: tsdb/index/tsi1 TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)
    testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_Compact_Write_Failwrite_MANIFEST3398667527\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process.
    --- FAIL: TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s)

We must close the open file descriptor otherwise the temporary file
cannot be cleaned up on Windows.

Fixes: 619eb1cae6 ("fix: restore in-memory Manifest on write error")
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
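
The pattern behind these Windows fixes, as a sketch (file name borrowed from the log above, test name assumed): close every handle before the test returns so `t.TempDir`'s RemoveAll cleanup can delete the directory.

    package example

    import (
        "os"
        "path/filepath"
        "testing"
    )

    func TestCleanupOnWindows(t *testing.T) {
        dir := t.TempDir()

        f, err := os.Create(filepath.Join(dir, "L0-00000001.tsl"))
        if err != nil {
            t.Fatal(err)
        }
        // On Windows an open handle keeps the file locked; without this Close
        // the TempDir RemoveAll cleanup fails with "being used by another
        // process".
        defer f.Close()

        // ... test body that uses f ...
    }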

* test: fix failing TestReplicationStartMissingQueue on Windows

=== FAIL: TestReplicationStartMissingQueue (1.60s)
    logger.go:130: 2023-03-17T10:42:07.269Z	DEBUG	Created new durable queue for replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    logger.go:130: 2023-03-17T10:42:07.305Z	INFO	Opened replication stream	{"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"}
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestReplicationStartMissingQueue76668607\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: update TestWAL_DiskSize

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestWAL_DiskSize on Windows

=== FAIL: tsdb/engine/tsm1 TestWAL_DiskSize (2.65s)
    testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestWAL_DiskSize2736073801\001\_00006.wal: The process cannot access the file because it is being used by another process.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

---------

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-03-21 16:22:11 -04:00
Sam Arnold b970e359dc
feat: remaining storage metrics from OSS engine (#22938)
* fix: simplify disk size tracking

* refactor: EngineTags in tsdb package

* fix: fewer compaction buckets and dead code removal

* feat: shard metrics

* chore: formatting

* feat: tsdb store metrics

* feat: retention check metrics

* chore: fix go vet

* fix: review comments
2021-12-02 09:01:46 -05:00
Sam Arnold edb21abe91
feat: metrics for wal subsystem (#22918)
https://github.com/influxdata/influxdb/issues/20026
2021-11-23 12:17:52 -05:00
Daniel Moran d747e7ec4e
feat: add config parameters to toggle WAL concurrency and timeouts (#21621)
* feat: add context parameter to Take() method on fixed limiter
* refactor: plumb context through to uses of Take()
* test: update tests to pass context as needed
* feat: add config toggles for setting WAL write concurrency & timeout
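
A hedged sketch of the limiter shape described above; the real limiter lives in the influxdb codebase and may differ, but the idea is that Take blocks until a slot frees up or the context is cancelled, which is what makes the new timeout toggle enforceable.

    package limiter

    import "context"

    // Fixed is a simple counting semaphore with a fixed number of slots.
    type Fixed chan struct{}

    func NewFixed(limit int) Fixed { return make(Fixed, limit) }

    // Take acquires a slot, or gives up when ctx is cancelled or times out.
    func (l Fixed) Take(ctx context.Context) error {
        select {
        case l <- struct{}{}:
            return nil
        case <-ctx.Done():
            return ctx.Err()
        }
    }

    func (l Fixed) Release() { <-l }

Callers would wrap the WAL write in Take/Release with a context derived from the configured timeout.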
2021-06-09 11:03:53 -04:00
Yun Zhao ce536037dc
fix(tsm1): limit concurrent WAL encodings to reduce memory pressure under heavy write load (#20814)
Co-authored-by: zhaoyun.248 <zhaoyun.248@bytedance.com>
2021-06-03 16:11:36 -04:00
Yun Zhao 265c1f311e
fix(tsm1): fix wal's totalOldDiskSize statistics (#20811) 2021-03-03 15:20:24 -05:00
Tristan Su 1a00f2f123
fix(tsm): should not check write-ahead-log size against default size (#20585)
It should check against the locally saved SegmentSize instead of the
default constant DefaultSegmentSize.
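
Reduced to a sketch (struct and field names are assumptions, not the actual tsm1 code), the fix is to compare against the WAL's own configured size:

    package walsketch

    const DefaultSegmentSize = 10 * 1024 * 1024

    type segmentWriter struct{ size int }

    type WAL struct {
        SegmentSize          int // per-WAL configured size, may differ from the default
        currentSegmentWriter *segmentWriter
    }

    func (w *WAL) shouldRollSegment() bool {
        // Compare against the WAL's own SegmentSize, not DefaultSegmentSize,
        // so a locally configured segment size is honored.
        return w.currentSegmentWriter.size > w.SegmentSize
    }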
2021-02-10 10:32:53 -05:00
Stuart Carnie dee8977d2c
chore: move v2/v1/tsdb → v2/tsdb 2020-08-26 10:46:47 -07:00
Mark Rushakoff f2898d1992 Wipe out workspace in preparation for v2 merge
"Knock knock."

"Who's there?"

"InfluxDB Veet."

...
2019-01-11 10:38:50 -08:00
Edd Robinson 0fc7643d59 Fix data race in WAL
This commit fixes a data race in the WAL, which can occur when writes
and deletes are being executed concurrently. The WAL uses a buffer pool
of `[]byte` when reading the WAL. WAL entries are unmarshaled into these
buffers and passed along to the relevant methods handling the different
types of entry (write, delete etc).

In the case of deletes, the keys that need to be deleted were being
stored for later processing; however, these keys were part of the backing
array of the initial buffer from the pool. As such, those keys could be
overwritten at a future time when handling other parts of the WAL.
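
The usual fix for this class of bug, sketched with assumed names: copy any keys that must outlive the WAL entry out of the pooled buffer before it is reused.

    package walsketch

    // collectDeleteKeys extracts series keys from a pooled WAL-entry buffer.
    // Because buf is returned to the pool and reused for later entries, each
    // key must be copied out rather than retained as a sub-slice of buf.
    func collectDeleteKeys(buf []byte, offsets [][2]int) [][]byte {
        keys := make([][]byte, 0, len(offsets))
        for _, off := range offsets {
            k := make([]byte, off[1]-off[0])
            copy(k, buf[off[0]:off[1]]) // safe: no longer aliases the pool buffer
            keys = append(keys, k)
        }
        return keys
    }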
2018-03-15 12:51:30 +00:00
Jonathan A. Sternberg d38413a849
Merge pull request #9454 from influxdata/js-structured-logging
Update logging calls to take advantage of structured logging
2018-02-21 09:14:40 -06:00
Jason Wilder f7279b57f3 Re-open last WAL segment
Re-open the last wal segment instead of creating a new one.  This fixes
an issue where the last modified time of the WAL would change on
restart.  It also avoids a lot of file IO churn on restart.
2018-02-20 14:24:04 -07:00
Jonathan A. Sternberg 2bbd96768d Update logging calls to take advantage of structured logging
Includes a style guide that details the basics of how to log.
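
Illustratively (zap is the logger InfluxDB uses; the call sites below are made up), the style guide's direction is a fixed message plus typed fields rather than interpolated strings:

    package example

    import "go.uber.org/zap"

    func logShardOpen(log *zap.Logger, path string, elapsedMS int64) {
        // Unstructured (discouraged): interpolating values into the message.
        log.Sugar().Infof("opened shard at %s in %dms", path, elapsedMS)

        // Structured (preferred): a fixed message plus typed key/value fields.
        log.Info("Opened shard",
            zap.String("path", path),
            zap.Int64("elapsed_ms", elapsedMS))
    }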
2018-02-20 10:04:19 -06:00
Jason Wilder 3299e549aa Increase WAL write buffer size
The default of 4096 results in writes to the WAL still requiring multiple
IOs.  We had previously bumped this to 1M, but that was too high when
there are many shards.  Increasing to around 16k reduces the IOs to
one or two for the workloads tested.  We may want to make this
configurable in the future.
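
In sketch form (surrounding names assumed), the buffer in question is the bufio writer wrapped around the WAL segment file:

    package walsketch

    import (
        "bufio"
        "os"
    )

    const walWriteBufferSize = 16 * 1024 // ~16k: one or two IOs per WAL write

    func newSegmentWriter(f *os.File) *bufio.Writer {
        // A 4096-byte default still needed multiple IOs per WAL write, and
        // 1M wasted memory with many shards; 16k was the tested middle ground.
        return bufio.NewWriterSize(f, walWriteBufferSize)
    }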
2018-01-31 13:55:32 -07:00
Joe LeGasse 68e20c4f80 wal: update lastWriteTime behavior 2018-01-16 21:22:24 -05:00
Jonathan A. Sternberg 0b7c56bcd8 Update the zap logger dependency
The previous sha was taken from a revision on a devel branch that I
thought would remain in the tree after it was merged. That
revision was rebased away and the API was changed for the logger.

This updates the usage of the logger and adds a simple package for
constructing the base logger.

The 1.0 version of zap changed the format of the default console logger
so this change moves over to this new logger instead of attempting to
retain backwards compatibility with the old format.
2017-11-10 16:27:16 -06:00
Jason Wilder fb7135ddc8 Fix corrupted wal segment panic on 32 bit systems 2017-10-16 09:41:20 -06:00
Jason Wilder f668b0cc3f Only use O_SYNC for tsm file writing
Doing this for the WAL reduces throughput quite a bit.
2017-10-03 10:48:13 -06:00
Jason Wilder 122a74c692 Use synchronous IO for wal and tsm writing
The fsyncs due to large writes when writing to TSM files and the
WAL can eventually cause large pauses.  Since we already buffer
writes, using synchronous IO reduces fsync latency by ensuring
the individual writes hit disk.  This spreads out the latency
across multiple writes better.
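
Mechanically this amounts to opening the file with O_SYNC, sketched below; per the commit above, this was later kept only for TSM file writing because it cost too much WAL throughput.

    package tsmsketch

    import "os"

    // openSync opens a file for writing with O_SYNC, so every write is
    // durable before the call returns. Combined with buffered writes, this
    // spreads the fsync cost across writes instead of one large pause later.
    func openSync(path string) (*os.File, error) {
        return os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_SYNC, 0666)
    }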
2017-09-25 12:44:57 -06:00
Jason Wilder 78922f9821 Set rc to nil when closing WALSegmentReader 2017-09-08 14:55:02 -06:00
Jason Wilder 5581f8b4ae Re-use WALSegmentReaders at startup 2017-09-07 12:56:17 -06:00
Jason Wilder 778000435a Convert all keys from string to []byte in TSM engine
This switches all the interfaces that take a string series key to
take a []byte.  This eliminates many small allocations where we
convert between the two repeatedly.  Eventually, this change should
propagate further up the stack.
2017-07-28 11:00:50 -06:00
Stuart Carnie eec80692c4 Taught tsm1 storage engine how to read and write uint64 values
* introduced UnsignedValue type
  * leveraged existing int64 compression algorithms (RLE, Simple 8B)
* tsm and WAL can read and write UnsignedValue
* compaction is aware of UnsignedValue
* unsigned support to model, cursors and write points

NOTE: there is no support to create unsigned points, as the line
protocol has not been modified.
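
A rough sketch of the value shape being introduced (field and method names are assumptions, not the actual tsm1 type):

    package tsm1sketch

    // UnsignedValue pairs a timestamp with a uint64 field value. Encoding can
    // reuse the existing int64 machinery (RLE, Simple 8B) by reinterpreting
    // the bits, which is what lets the WAL and compaction handle it with
    // little new code.
    type UnsignedValue struct {
        unixnano int64
        value    uint64
    }

    func (v UnsignedValue) UnixNano() int64 { return v.unixnano }
    func (v UnsignedValue) Value() uint64   { return v.value }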
2017-07-24 09:03:22 -07:00
Jason Wilder e9370e0b86 Fix indefinite hang in WAL.writeToLog
There was a race between the WAL writeToLog and scheduleSync which could
lead to a writing goroutine blocking indefinitely on its syncErr channel.

The issue was that the clearing of the syncCount happened after the
wal was unlocked.  If a goroutine was able to lock, write, and call scheduleSync
before the existing scheduleSync goroutine returned and ran the defer to
clear the syncCount, then a new scheduleSync goroutine would not get started.
This left the writing goroutine blocked with nothing to signal it.

While in this state, an RLock on the engine was held.  If a Lock was requested
on the engine during this time, all future writes and queries would block waiting
on the blocked wal writer.

The fix is to move the atomic clearing of syncCount before the Lock is released.
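
Reduced to its essence (heavily simplified; the real writeToLog/scheduleSync carry much more state), the fix is an ordering change:

    package walsketch

    import (
        "sync"
        "sync/atomic"
    )

    type wal struct {
        mu        sync.Mutex
        syncCount uint64 // non-zero while a sync goroutine is running
    }

    // syncLoopExit shows the corrected shutdown ordering for the sync
    // goroutine. Clearing syncCount while the lock is still held means any
    // writer that subsequently acquires the lock sees syncCount == 0 and
    // starts a fresh sync goroutine, so no writer is left waiting with
    // nothing scheduled to signal it.
    func (l *wal) syncLoopExit() {
        l.mu.Lock()
        atomic.StoreUint64(&l.syncCount, 0) // clear BEFORE releasing the lock
        l.mu.Unlock()
    }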
2017-07-07 13:31:52 -06:00
Stuart Carnie c863923e68 cache MarshalSize 2017-05-12 14:05:25 -06:00
Stuart Carnie 0151afe31c check size and allocate once 2017-05-12 14:05:25 -06:00
Stuart Carnie 096d6f65b4 explicit sizes 2017-05-12 14:05:24 -06:00
Jason Wilder 503d41a08f Add LimitedBytePool for wal buffers
This pool was previously a pool.Bytes to avoid repetitive allocations.
It was recently switched to a sync.Pool because pool.Bytes held onto
very large buffers at times which were never released.  sync.Pool is
showing up in allocation profiles quite frequently.

This switches the pool to a new pool that limits how many buffers are
in the pool as well as the max size of each buffer in the pool.  This
provides better bounds on allocations.
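
A minimal sketch of a pool with those bounds, using assumed names: a buffered channel caps how many buffers are retained, and a max size keeps oversized buffers out of the pool entirely.

    package poolsketch

    // LimitedBytes keeps at most cap(p.pool) buffers, each no larger than
    // maxSize, so the pool's worst-case memory is bounded.
    type LimitedBytes struct {
        pool    chan []byte
        maxSize int
    }

    func NewLimitedBytes(capacity int, maxSize int) *LimitedBytes {
        return &LimitedBytes{pool: make(chan []byte, capacity), maxSize: maxSize}
    }

    func (p *LimitedBytes) Get(sz int) []byte {
        select {
        case b := <-p.pool:
            if cap(b) >= sz {
                return b[:sz]
            }
        default:
        }
        return make([]byte, sz)
    }

    func (p *LimitedBytes) Put(b []byte) {
        if cap(b) > p.maxSize {
            return // too big: let the GC reclaim it instead of pooling it
        }
        select {
        case p.pool <- b:
        default: // pool full: drop the buffer
        }
    }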
2017-05-11 11:27:00 -06:00
Jason Wilder e102fcca9c Use buffer writer for wal segments 2017-05-10 11:42:32 -06:00
Jason Wilder 88848a9426 Remove per shard monitor goroutine
The monitor goroutine ran for each shard and updated disk stats
as well as logged cardinality warnings.  This goroutine has been
removed by making the disk stats more lightweight and callable
directly from Statistics, and by moving the logging to the tsdb.Store.  The
latter allows one goroutine to handle all shards.
2017-05-03 16:31:57 -06:00
Jason Wilder 137d0c0d09 Rename WAL.WritePoints to WAL.WriteMulti
To match Cache.WriteMulti
2017-04-28 13:20:55 -06:00
Jason Wilder d88604f6f2 Move repetitive loop checks outside of values loop 2017-04-20 13:45:04 -06:00
Jason Wilder 888689f5d3 Move values loop under type switch
All the values read must be of the same type, so repeatedly using
the type switch is confusing and less efficient.
2017-04-20 13:39:49 -06:00
Jason Wilder b0988511bf Use fixed size array instead of slice 2017-04-20 13:38:33 -06:00
Jason Wilder da6bdfdda8 Use bufio.Reader when reading wal segments
Reduces disk IO due to small reads.
2017-04-20 13:33:42 -06:00
Jason Wilder 8e9cbd7ffc Simplify WALSegmentReader.UnmarshalBinary
There were two loops over nvals which created some extra allocations
that could be replaced with a simple slice capacity and append.
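
Illustratively, the simplification is the usual allocate-once-with-capacity-then-append shape:

    package walsketch

    // decodeValues sketches the single-loop form: allocate once with the
    // known capacity, then append, instead of looping over nvals twice.
    func decodeValues(nvals int, next func() int64) []int64 {
        values := make([]int64, 0, nvals)
        for i := 0; i < nvals; i++ {
            values = append(values, next())
        }
        return values
    }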
2017-04-20 13:33:42 -06:00
Jason Wilder ef65ee77f4 Switch WAL byte pools to sync/pool
The current bytes.Pool will hold onto byte slices indefinitely. Large
writes can cause the pool to hold onto very large buffers over time.
Testing with sync.Pool seems to perform similarly now, so using a sync.Pool
will allow these buffers to be GC'd when necessary.
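
For contrast with the LimitedBytePool change above, the sync.Pool shape looks roughly like this (illustrative, not the exact code); buffers left idle in the pool can be reclaimed by the GC:

    package poolsketch

    import "sync"

    // bytesPool hands out byte slices that the GC may reclaim when idle,
    // unlike a plain free list that pins them forever.
    var bytesPool = sync.Pool{
        New: func() interface{} { return make([]byte, 0, 4096) },
    }

    func getBuf() []byte  { return bytesPool.Get().([]byte)[:0] }
    func putBuf(b []byte) { bytesPool.Put(b) }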
2017-04-20 12:28:42 -06:00
Jason Wilder d7c5dd0a3e Reduce wal sync goroutine churn
Under high write load, the sync goroutine would start up and end
very frequently.  Starting a new goroutine so frequently adds a small
amount of latency which causes writes to take longer and sometimes time out.

This changes the goroutine to loop until there are no more waiters, which
reduces the churn and latency.
2017-04-20 12:28:34 -06:00
Jason Wilder aa9925621b Fix deadlock in wal
If the sync waiters channel was full, it would block sending to the
channel while holding the wal write lock.  The sync goroutine would
then be stuck acquiring the write lock and could not drain the channel.

This increases the buffer to 1024, which would require a very high write
load to fill, and returns an error if the channel is full to prevent
the blocking.
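
The non-blocking half of the fix, in sketch form (names assumed): while the write lock is held, never block on the waiters channel; if it is somehow full, return an error instead of deadlocking.

    package walsketch

    import "errors"

    var ErrWALBacklog = errors.New("wal: sync waiters channel full")

    // queueSyncWaiter registers a writer waiting for the next fsync. It is
    // called with the WAL write lock held, so it must never block: if the
    // large (1024-slot) buffer is somehow full, fail fast rather than
    // deadlock against the sync goroutine that needs the lock to drain it.
    func queueSyncWaiter(syncWaiters chan chan error, done chan error) error {
        select {
        case syncWaiters <- done:
            return nil
        default:
            return ErrWALBacklog
        }
    }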
2017-04-19 11:33:13 -06:00
Jason Wilder c443e639b0 Fix 32bit alignment issue in wal.sync 2017-03-22 11:21:29 -06:00
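
The diff isn't shown here, but this class of fix generally comes down to Go's sync/atomic alignment rule: 64-bit values accessed atomically must be 64-bit aligned, and on 32-bit platforms only the first word of an allocated struct is guaranteed that alignment. A hedged sketch, not the commit's exact change:

    package walsketch

    import "sync/atomic"

    type walStats struct {
        // Keep atomically accessed 64-bit counters first: on 32-bit
        // platforms a misaligned atomic.AddInt64 panics.
        oldSegmentBytes int64

        path string
    }

    func (s *walStats) addOldSegmentBytes(n int64) {
        atomic.AddInt64(&s.oldSegmentBytes, n)
    }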
Jason Wilder 8f7b251afd Merge branch 'master' into jw-tsi 2017-03-20 17:17:26 -06:00
Jason Wilder e9eb925170 Coalesce multiple WAL fsyncs
Fsyncs to the WAL can cause higher IO with lots of small writes or
slower disks.  This reworks the previous wal fsyncing to remove the
extra goroutine and remove the hard-coded 100ms delay.  Writes to
the wal still maintain the invariant that they do not return to the
caller until the write is fsync'd.

This also adds a new config option, wal-fsync-delay (default 0s),
which can be increased if a delay is desired.  This is somewhat useful
for systems with slower disks, but the current default works well as
is.
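
A reduced sketch of the coalescing idea with the configurable delay (names assumed; the real WAL code carries more state): writers queue a result channel, one fsync runs after the delay, and the result fans back out so each write still returns only after its data is durable.

    package walsketch

    import (
        "os"
        "time"
    )

    // syncOnce waits up to fsyncDelay (0 means sync immediately), performs a
    // single fsync, and reports the result to every writer queued so far, so
    // many logical writes share one physical fsync.
    func syncOnce(f *os.File, fsyncDelay time.Duration, waiters <-chan chan error) {
        if fsyncDelay > 0 {
            time.Sleep(fsyncDelay)
        }
        err := f.Sync()
        for {
            select {
            case w := <-waiters:
                w <- err // each blocked writer now returns with the fsync result
            default:
                return
            }
        }
    }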
2017-03-15 16:31:03 -06:00
Ben Johnson 358b1e0b05
Merge remote-tracking branch 'upstream/master' into tsi 2017-03-15 10:13:32 -06:00
Mark Rushakoff 601cbcd084 Merge branch '1.2' into mr-merge-12 2017-02-17 16:14:22 -08:00
Jonathan A. Sternberg 2fe48d6781 Rename zap import back to github.com/uber-go/zap
They rebased a revision we were previously relying upon that allowed us
to use the vanity name so we are reverting back to an older version with
the old import path.
2017-02-17 17:17:22 -06:00
Jason Wilder 93a9d01643 Increase default waiting WAL writes 2017-02-06 11:48:51 -07:00
Jason Wilder 38a649fc40 Batch multiple WAL fsyncs
Every write to the WAL currently runs an fsync before returning.  When
there are a lot of concurrent writes, this can cause the WAL to bottleneck
write throughput since fsyncs are very expensive.

This changes the writeToLog to fsync on an interval to allow multiple fsync
calls to be batched up into one.  The writeToLog behavior is the same in that
it won't return until an fsync has been performed.
2017-02-06 11:48:45 -07:00