When a dataFile is deleted, its f file pointer is set to nil. Since deleting
a file happens asynchronously, code that obtained a reference while the file
was valid may run after the file is gone.
dataFile was not protected by a mutex, which caused data races in live
code and tests. filesAndLock used reflect.DeepEqual on a copy of dataFile
slices. reflect.DeepEqual appears to access unexported dataFile fields,
which can't be protected. This was changed to use an equals func that
requires a mutex to be acquired.
The other issue was that many of the dataFile funcs access the mmap without
acquiring a lock. When a dataFile is deleted (possibly during rewriting),
reads from the mmap could return invalid data because references to the dataFile
are still in use by other goroutines.
Fixes #4534
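The locking described in the dataFile fix above can be sketched roughly as follows. This is a simplified illustration, not the engine's code: the mmap is a plain byte slice, and the name field, errFileDeleted, read, remove, fileName and the body of equals are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFileDeleted is a hypothetical sentinel error used only in this sketch.
var errFileDeleted = errors.New("data file deleted")

// dataFile is a simplified stand-in: mmap is a plain byte slice here,
// not a real memory mapping.
type dataFile struct {
	mu      sync.RWMutex
	name    string
	mmap    []byte
	deleted bool
}

// read copies n bytes starting at off while holding the read lock, so a
// concurrent remove cannot invalidate the mapping in the middle of a read.
func (d *dataFile) read(off, n int) ([]byte, error) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	if d.deleted {
		return nil, errFileDeleted
	}
	if off < 0 || n < 0 || off+n > len(d.mmap) {
		return nil, fmt.Errorf("read out of range: off=%d n=%d", off, n)
	}
	out := make([]byte, n)
	copy(out, d.mmap[off:off+n])
	return out, nil
}

// remove takes the write lock, so it waits for in-flight readers to finish.
func (d *dataFile) remove() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.mmap = nil
	d.deleted = true
}

// fileName reads the name under the lock.
func (d *dataFile) fileName() string {
	d.mu.RLock()
	defer d.mu.RUnlock()
	return d.name
}

// equals compares through locked accessors instead of reflect.DeepEqual.
// Each lock is taken and released separately, never nested.
func (d *dataFile) equals(o *dataFile) bool {
	return d == o || d.fileName() == o.fileName()
}

func main() {
	d := &dataFile{name: "_00001.tsm", mmap: []byte("block data")}
	b, err := d.read(0, 5)
	fmt.Printf("%q %v\n", b, err)

	d.remove()
	_, err = d.read(0, 5)
	fmt.Println(err)
}
```

A reader that still holds a reference after deletion gets an error instead of touching an unmapped region.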
Mainly for debugging, since this should not happen going forward. Since
there may be points with NaN already stored in the WAL, this is helpful for
troubleshooting panics.
NaN float values are not supported in the existing engine and the tsm1
engines. This changes NewPoint to return an error if a field value
is NaN. It also allows us to validate fields to prevent
other unsupported types from sneaking in through other input plugins.
When a database is dropped, removing old segments returns an error
because the files are already gone. Using RemoveAll handles this
case more gracefully.
If a drop database is executed while writes are in flight, a panic
could occur because the WAL would fail to write to the DB dirs that
had been removed.
Partial fix for #4538
The Stat+Remove calls are unnecessary because Rename replaces
the destination file if it exists. There is no need to remove
the destination file before calling Rename.
Several places use os.Remove and check for os.ErrNotExist. os.Remove
does not return os.ErrNotExist, it returns a *PathError, so these remove
calls will panic if the file does not exist.
Instead use os.RemoveAll, which does not return an error if the file does
not exist.
Fixes #4545
* refactor compaction
* rework compaction cleanup logic to work with multiple resulting files
* ensure the uint64 number for a series key doesn't use 0 or MaxInt64 for sentinel values
Close acquired the cacheLock and writeLock in a different order than flush. If addToCache was also
running in a goroutine (acquiring cacheLock), a deadlock could happen.
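A rough sketch of the consistent lock ordering this implies. The engine type, its fields, and the choice of cacheLock before writeLock are assumptions for illustration. What matters is that every path uses the same order.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a simplified stand-in with the two locks named above.
type engine struct {
	cacheLock sync.Mutex
	writeLock sync.Mutex
	cache     []string
	flushed   int
	closed    bool
}

// addToCache only needs cacheLock.
func (e *engine) addToCache(v string) {
	e.cacheLock.Lock()
	defer e.cacheLock.Unlock()
	if !e.closed {
		e.cache = append(e.cache, v)
	}
}

// flush takes cacheLock first, then writeLock.
func (e *engine) flush() {
	e.cacheLock.Lock()
	defer e.cacheLock.Unlock()
	e.writeLock.Lock()
	defer e.writeLock.Unlock()
	e.flushed += len(e.cache)
	e.cache = e.cache[:0]
}

// close uses the same order as flush, so the two cannot wait on each other.
func (e *engine) close() {
	e.cacheLock.Lock()
	defer e.cacheLock.Unlock()
	e.writeLock.Lock()
	defer e.writeLock.Unlock()
	e.closed = true
}

func main() {
	e := &engine{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); e.addToCache("point") }()
		go func() { defer wg.Done(); e.flush() }()
	}
	wg.Wait()
	e.close()
	fmt.Println("closed without deadlock")
}
```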
panic: error opening new segment file for wal: open /var/folders/lj/vlbynqp52pxdxxlxx64j6bk80000gn/T/tsm1-test709000715/_00002.wal: no such file or directory
goroutine 8 [running]:
github.com/influxdb/influxdb/tsdb/engine/tsm1.(*Log).writeToLog(0xc820098500, 0x1, 0xc8201584b0, 0x1c, 0x45, 0x0, 0x0)
/Users/jason/go/src/github.com/influxdb/influxdb/tsdb/engine/tsm1/wal.go:427 +0xc19
When rewriting a tsm file, a panic on the Values slice could happen
if there were no values in the slice and the conditions of the rewrite
caused DecodeAndCombine to be called with the empty slice. This could
happen if the size of the point's new values was equal to
the MaxPointsInBlock config option and there were no future blocks after
the current one being written.
When this happens, DecodeAndCombine returns a zero length remaining values
slice which is passed back into DecodeAndCombine one last time. In this case,
we now just return the original block since there is nothing new to combine.
Fixes #4444 #4365
This will help large integer counter-type fields that increment by
small amounts over time. Instead of storing the larger raw value
in a compressed format, we store the difference from the prior value
in compressed format which allows the value to be stored using
fewer bits.
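The idea can be sketched as follows. The zig-zag step (mapping signed deltas to small unsigned numbers) is an assumption added here so that negative deltas also stay small. The function names are hypothetical, and the final bit-packing stage is left out.

```go
package main

import "fmt"

// zigZagEncode maps small signed values to small unsigned values
// (0, -1, 1, -2 ... become 0, 1, 2, 3 ...).
func zigZagEncode(x int64) uint64 {
	return uint64(x<<1) ^ uint64(x>>63)
}

// zigZagDecode reverses zigZagEncode.
func zigZagDecode(v uint64) int64 {
	return int64(v>>1) ^ -int64(v&1)
}

// deltaEncode stores each value as the difference from the prior value.
// The first value is stored as a delta from zero.
func deltaEncode(values []int64) []uint64 {
	out := make([]uint64, len(values))
	var prev int64
	for i, v := range values {
		out[i] = zigZagEncode(v - prev)
		prev = v
	}
	return out
}

// deltaDecode rebuilds the original values by summing the deltas.
func deltaDecode(deltas []uint64) []int64 {
	out := make([]int64, len(deltas))
	var prev int64
	for i, d := range deltas {
		prev += zigZagDecode(d)
		out[i] = prev
	}
	return out
}

func main() {
	counter := []int64{1000000, 1000003, 1000004, 1000010}
	enc := deltaEncode(counter)
	fmt.Println(enc)
	fmt.Println(deltaDecode(enc))
}
```

After the first entry, the encoded values are small numbers that a bit-packing scheme can store in far fewer bits than the raw counter.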
influx_inspect uncovered some scenarios where timestamps could be stored using
run-length encoding but were being stored using simple8b which uses more space.
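As a rough illustration of when run-length encoding applies: if every timestamp is the same distance from the previous one, the block can be described by a first value, a delta, and a count. The type and function names below are hypothetical, and the on-disk format is not shown.

```go
package main

import "fmt"

// rleTimestamps is a hypothetical compact form for evenly spaced timestamps.
type rleTimestamps struct {
	first int64
	delta int64
	count int
}

// tryRunLength reports whether ts is evenly spaced and, if so, returns the
// run-length form. Fewer than two timestamps are left to the other encoder.
func tryRunLength(ts []int64) (rleTimestamps, bool) {
	if len(ts) < 2 {
		return rleTimestamps{}, false
	}
	delta := ts[1] - ts[0]
	for i := 2; i < len(ts); i++ {
		if ts[i]-ts[i-1] != delta {
			return rleTimestamps{}, false
		}
	}
	return rleTimestamps{first: ts[0], delta: delta, count: len(ts)}, true
}

// decode expands the run-length form back into individual timestamps.
func (r rleTimestamps) decode() []int64 {
	out := make([]int64, r.count)
	for i := range out {
		out[i] = r.first + int64(i)*r.delta
	}
	return out
}

func main() {
	regular := []int64{1000, 2000, 3000, 4000}
	if r, ok := tryRunLength(regular); ok {
		fmt.Println(r, r.decode())
	}
	_, ok := tryRunLength([]int64{1000, 2000, 3500})
	fmt.Println(ok)
}
```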
If DecodeSameTypeBlock is called on an empty Values slice, it would
panic with an index out of bounds error. This func can actually be removed
because DecodeBlock can determine what type of values are encoded already.
This will still panic if the block cannot be decoded due to other reasons.
Fixes #4365
If similar float values were encoded, the number of leading zero bits would
overflow the 5 available bits to store them (e.g. store 33 in 5 bits). When
decoding, the values after the overflowed value would spike to very large and
small values.
To prevent the overflow, we clamp the value to 31 which is the maximum
number of leading zero bits we can encode.
Fixes #4357