When an InfluxDB database is very busy writing new points, the backup
process can fail because it cannot write a new snapshot.
The error is: `operation timed out with error: create snapshot: snapshot in progress`.
This happens because the high number of ingested points causes InfluxDB
to take snapshots of the cache almost continuously.
This PR skips the snapshot if the `snapshotter` does not become
available after three attempts when a backup is requested. In that case
the backup won't contain the data in the cache or WAL.
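The retry behaviour is roughly the following minimal sketch; `createSnapshot`, `backupSnapshot`, and the retry delay are illustrative stand-ins, not the actual engine API:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errSnapshotInProgress stands in for the engine error quoted above.
var errSnapshotInProgress = errors.New("snapshot in progress")

// createSnapshot is a hypothetical stand-in for asking the snapshotter
// to take a snapshot of the cache.
func createSnapshot() error {
	return errSnapshotInProgress
}

// backupSnapshot asks for a snapshot up to three times; if the
// snapshotter never becomes available, the snapshot is skipped and the
// backup proceeds without the cache/WAL data.
func backupSnapshot() error {
	const attempts = 3
	var err error
	for i := 0; i < attempts; i++ {
		if err = createSnapshot(); err == nil {
			return nil
		}
		time.Sleep(100 * time.Millisecond) // illustrative back-off
	}
	fmt.Printf("skipping snapshot after %d attempts: %v\n", attempts, err)
	return nil
}

func main() {
	_ = backupSnapshot()
}
```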
Signed-off-by: Gianluca Arbezzano <gianarb92@gmail.com>
refactor(influxdb): Refactor trace code for clarity and reliability
* Make startProfile a method of Server. startProfile() is only used in
one place. We bother to copy the Server.CPUProfile and Server.MemProfile
values out of our Server struct into its parameters. It makes more
sense to make startProfile() a method of Server and have it access
those members directly via its receiver value (a sketch follows this
list).
* Have startProfile() return an error instead of log.Fatal()ing. We can
simply propagate the error up the stack and let the caller handle it --
we shouldn't be exiting deep in the bowels of a non-main package.
* Capture and return errors from pprof.StartCPUProfile(). Currently
there is only one possible error it can return, but if it does, we
should handle it.
* Add CPUProfileWriteCloser and MemProfileWriteCloser to the Server struct.
* Make stopProfile() a method of Server.
* Remove the prof variable.
* Fix capitalization of log messages.
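A minimal sketch of the refactor, assuming field and method shapes close to the ones named above; it is illustrative, not the exact influxd code:

```go
package main

import (
	"io"
	"log"
	"os"
	"runtime/pprof"
)

// Server holds the profile paths plus the open profile files.
type Server struct {
	CPUProfile string
	MemProfile string

	CPUProfileWriteCloser io.WriteCloser
	MemProfileWriteCloser io.WriteCloser
}

// startProfile reads the profile paths from its receiver and returns an
// error instead of calling log.Fatal, leaving the decision to the caller.
func (s *Server) startProfile() error {
	if s.CPUProfile != "" {
		f, err := os.Create(s.CPUProfile)
		if err != nil {
			return err
		}
		s.CPUProfileWriteCloser = f
		if err := pprof.StartCPUProfile(f); err != nil {
			return err
		}
	}
	if s.MemProfile != "" {
		f, err := os.Create(s.MemProfile)
		if err != nil {
			return err
		}
		s.MemProfileWriteCloser = f
	}
	return nil
}

// stopProfile flushes and closes whichever profiles were started.
func (s *Server) stopProfile() {
	if s.CPUProfileWriteCloser != nil {
		pprof.StopCPUProfile()
		s.CPUProfileWriteCloser.Close()
	}
	if s.MemProfileWriteCloser != nil {
		pprof.WriteHeapProfile(s.MemProfileWriteCloser)
		s.MemProfileWriteCloser.Close()
	}
}

func main() {
	s := &Server{CPUProfile: "cpu.prof"}
	if err := s.startProfile(); err != nil {
		log.Fatal(err) // main is the right place to exit
	}
	defer s.stopProfile()
}
```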
This PR initializes Flux built-in functions lazily, when Flux is first
used. It significantly reduces the startup time of the `influxd` and
`influx` binaries.
Before (4.66s):
```
↳ time bin/18/influx
bin/18/influx 4.66s user 0.19s system 198% cpu 2.441 total
```
After (10ms):
```
↳ time bin/18/influx
bin/18/influx 0.01s user 0.01s system 88% cpu 0.021 total
```
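The pattern behind the change is ordinary lazy initialization; a minimal sketch, with a hypothetical `loadBuiltins` standing in for the expensive setup (the real Flux entry points differ):

```go
package main

import "sync"

var builtinsOnce sync.Once

// loadBuiltins stands in for the expensive parsing and registration of
// Flux built-in packages that previously ran at process start.
func loadBuiltins() {
	// ... parse and register built-in packages ...
}

// ensureBuiltins is called on the first Flux query instead of at
// startup, so invocations that never touch Flux skip the cost entirely.
func ensureBuiltins() {
	builtinsOnce.Do(loadBuiltins)
}

func main() {
	// Startup no longer pays for built-in initialization; the cost is
	// deferred until a Flux query actually arrives:
	ensureBuiltins()
}
```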
Prior to this change, new series would be added to the series file
before the series cardinality limit was checked. If the limit was
exceeded, the write was rejected even though the series had already
been added to the series file. This change performs the limit check
first, so a rejected write no longer leaves the new series behind.
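A minimal sketch of the check-before-insert ordering, using a hypothetical simplified series file rather than the engine's types:

```go
package main

import (
	"errors"
	"fmt"
)

var errMaxSeriesExceeded = errors.New("max-series-per-database limit exceeded")

type seriesFile struct {
	keys  map[string]struct{}
	limit int
}

// createSeriesIfNotExists checks the cardinality limit *before*
// inserting, so a rejected write leaves no trace in the series file.
func (sf *seriesFile) createSeriesIfNotExists(key string) error {
	if _, ok := sf.keys[key]; ok {
		return nil // existing series never count against the limit
	}
	if sf.limit > 0 && len(sf.keys) >= sf.limit {
		return errMaxSeriesExceeded
	}
	sf.keys[key] = struct{}{}
	return nil
}

func main() {
	sf := &seriesFile{keys: map[string]struct{}{}, limit: 1}
	fmt.Println(sf.createSeriesIfNotExists("cpu,host=a")) // <nil>
	fmt.Println(sf.createSeriesIfNotExists("cpu,host=b")) // limit error
	fmt.Println(len(sf.keys))                             // 1: nothing left behind
}
```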
This commit prevents multiple blocks for the same series key from
having their values truncated when they are read into an empty buffer.
The current cursor reader code has an optimisation that incorrectly
assumes the incoming array will be limited to 1,000 values (the maximum
block size), but arrays can contain values from multiple matching
blocks.
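A simplified illustration of the faulty assumption, with a hypothetical 1,000-value cap standing in for the maximum block size:

```go
package main

import "fmt"

const maxBlockValues = 1000

// mergeBlocksWrong sizes the destination by the maximum block size and
// stops there, dropping values when several blocks share a series key.
func mergeBlocksWrong(blocks [][]int64) []int64 {
	out := make([]int64, 0, maxBlockValues)
	for _, b := range blocks {
		for _, v := range b {
			if len(out) == maxBlockValues { // incorrect cap: truncates
				return out
			}
			out = append(out, v)
		}
	}
	return out
}

// mergeBlocks grows the buffer as needed, since the input may contain
// values from multiple matching blocks.
func mergeBlocks(blocks [][]int64) []int64 {
	var out []int64
	for _, b := range blocks {
		out = append(out, b...)
	}
	return out
}

func main() {
	two := [][]int64{make([]int64, maxBlockValues), make([]int64, 500)}
	fmt.Println(len(mergeBlocksWrong(two))) // 1000: truncated
	fmt.Println(len(mergeBlocks(two)))      // 1500: all values kept
}
```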
* fix(storage): skip TSM files with block read errors
When we find a bad TSM file during compaction, propagate the error up and move
the bad file aside. The engine will disregard the file so the next compaction
will not hit the same error.
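Sketched below is the quarantine step, assuming a `.bad` rename and illustrative names rather than the exact engine behaviour:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// quarantineTSM moves a TSM file that returned a block read error aside
// so the engine disregards it and later compactions skip it. The ".bad"
// suffix is an assumption for illustration.
func quarantineTSM(path string, readErr error) error {
	bad := path + ".bad"
	if err := os.Rename(path, bad); err != nil {
		return fmt.Errorf("quarantine %s: %w", path, err)
	}
	fmt.Printf("moved %s aside after block read error: %v\n", bad, readErr)
	return nil
}

func main() {
	f, err := os.CreateTemp("", "*.tsm")
	if err != nil {
		panic(err)
	}
	f.Close()
	_ = quarantineTSM(f.Name(), errors.New("block checksum mismatch"))
}
```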
This change adds a lock around digest creation so that it is safe for
concurrent calls. Prior to this change, calls from multiple goroutines
resulted in "Digest aborted, problem renaming tmp digest" errors.
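The serialization looks roughly like the following sketch; the mutex placement and names are assumptions, not the engine's exact code:

```go
package main

import (
	"fmt"
	"sync"
)

var digestMu sync.Mutex

// createDigest writes a shard digest to a tmp file and renames it into
// place; concurrent callers previously raced on that rename.
func createDigest(shard string) error {
	digestMu.Lock() // one digest at a time
	defer digestMu.Unlock()
	// ... write tmp digest, then rename it into place ...
	fmt.Println("digest created for", shard)
	return nil
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // concurrent calls are now safe
		wg.Add(1)
		go func() { defer wg.Done(); _ = createDigest("shard-1") }()
	}
	wg.Wait()
}
```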
Fixes #15859
This commit fixes a defect in the TSI index where a filter using the
negated equality operator would result in no matching series being
returned for series stored within the `IndexFile` portions of the index.
The root cause was missing legacy-handling code in the index for this
particular iterator.
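For illustration only, the expected semantics of the negated filter, with hypothetical simplified types: a `!=` match must consult every portion of the index, including the `IndexFile`-backed ones.

```go
package main

import "fmt"

type series struct{ host string }

// filterNotEqual returns the series whose tag value differs, drawing
// from every index portion (in-memory log files and IndexFiles alike).
func filterNotEqual(portions [][]series, value string) []series {
	var out []series
	for _, p := range portions {
		for _, s := range p {
			if s.host != value {
				out = append(out, s)
			}
		}
	}
	return out
}

func main() {
	logFile := []series{{"a"}}
	indexFile := []series{{"b"}, {"c"}} // previously yielded no matches
	fmt.Println(filterNotEqual([][]series{logFile, indexFile}, "a"))
}
```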
This upgrades the Flux version to v0.50.2.
The secret service, which is used for alerts, is not included. The
`to()` function is also still not included.