Fixes #8819.
Previously, the process of dropping expired shards according to the
retention policy duration was managed by two independent goroutines in
the retention policy service. This behaviour was introduced in #2776,
at a time when there were both data and meta nodes in the OSS codebase.
The idea was that only the leader meta node would run the meta data
deletions in the first goroutine, and all other nodes would run the
local deletions in the second goroutine.
InfluxDB no longer operates that way, so we ended up with two
independent goroutines carrying out actions that actually depended
on each other.
If the second goroutine runs before the first, it may not yet see the
meta data changes indicating that shards should be deleted, and so it
won't delete any shards locally. Shortly afterwards, the first
goroutine runs and removes the meta data for the shard groups.
This results in a situation where it looks like the shards have gone,
but in fact they remain on disk (and, importantly, their series remain
in the index) until the next time the second goroutine runs. By default
that's 30 minutes.
In the case where the shards to be removed contained the last
occurrences of some series, and the database was already at its
maximum series limit (or tag limit, for that matter), no further new
series could be inserted.
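A minimal sketch of the combined approach (hypothetical names, one shard per group for brevity, not the actual retention service code): a single loop performs the meta data deletion and the local deletion in order, so the local pass always observes the meta changes it depends on.

```go
package retention

import (
	"fmt"
	"time"
)

// MetaClient and ShardStore are hypothetical stand-ins for the meta
// store and the local TSDB store.
type MetaClient interface {
	// ExpiredShardGroups returns IDs of shard groups whose retention
	// period has elapsed as of now.
	ExpiredShardGroups(now time.Time) []uint64
	// DeleteShardGroup marks a shard group as deleted in the meta data.
	DeleteShardGroup(id uint64) error
}

type ShardStore interface {
	// DeleteShard removes a shard's data (and its index entries) from disk.
	DeleteShard(id uint64) error
}

// enforce runs both phases in order on a single goroutine, so the
// local deletions always see the meta deletions that precede them.
func enforce(meta MetaClient, store ShardStore, interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-done:
			return
		case now := <-ticker.C:
			for _, id := range meta.ExpiredShardGroups(now) {
				// Phase 1: remove the meta data for the expired group.
				if err := meta.DeleteShardGroup(id); err != nil {
					fmt.Printf("retention: delete shard group %d: %v\n", id, err)
					continue
				}
				// Phase 2: remove the shard data locally, in the same pass,
				// rather than waiting for an independent goroutine's next tick.
				if err := store.DeleteShard(id); err != nil {
					fmt.Printf("retention: delete shard %d: %v\n", id, err)
				}
			}
		}
	}
}
```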
When a downstream server, such as a proxy or load balancer between
InfluxDB and the client, produces an error, the client currently does
not make this very obvious.
This change introduces checks on both the content type and the
InfluxDB version header to identify whether a request was served by
InfluxDB itself, and returns a more appropriate error in the cases
where it can be determined a downstream issue is at play.
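A minimal sketch of the detection logic (the helper name and error messages are illustrative, not the client's actual code; `X-Influxdb-Version` is the version header the server sets on its responses):

```go
package client

import (
	"fmt"
	"net/http"
	"strings"
)

// checkUpstream inspects an error response: if it carries neither an
// InfluxDB version header nor a JSON content type, it was most likely
// produced by a proxy or load balancer in between, so say so instead
// of surfacing an opaque body.
func checkUpstream(resp *http.Response) error {
	if resp.StatusCode < 400 {
		return nil
	}
	fromInflux := resp.Header.Get("X-Influxdb-Version") != "" ||
		strings.HasPrefix(resp.Header.Get("Content-Type"), "application/json")
	if !fromInflux {
		return fmt.Errorf("received status code %d from downstream server, not InfluxDB",
			resp.StatusCode)
	}
	// Otherwise fall through to normal InfluxDB error decoding (not shown).
	return fmt.Errorf("influxdb returned status code %d", resp.StatusCode)
}
```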
* Fprint* functions
* No naked returns
* clarify panic messages
* spacing between case statements
* remove break in favor of return
* remove goto in favor of for { continue } (see the sketch after this list)
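For illustration, a hypothetical before/after of the goto removal (not taken from the diff):

```go
// Before: a label and goto to retry the body.
//
//	retry:
//		if err := doWork(); err != nil {
//			goto retry
//		}
//
// After: the same control flow expressed with for { continue }.
// A real implementation would likely bound the retries.
func retryLoop(doWork func() error) {
	for {
		if err := doWork(); err != nil {
			continue // try again
		}
		return
	}
}
```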
* batch cursors return slices of timestamps and values to reduce per-call
  overhead; significantly improved iteration (see the sketch after this list)
* added CreateCursor API to Shard, Engine
* moved build*Cursor to code gen
* array has already been sized correctly
* eliminates bounds checking for each element access
* reduces decoding of 30,000,000 points via storage API from
584ms to 540ms on average
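As an illustration of the shape of the API, not the exact Engine interface, a minimal batch-cursor sketch:

```go
package tsdb

// IntegerBatchCursor sketches the idea: instead of one Next() call per
// point, each Next() returns a block of timestamps and values, so the
// per-point call overhead is amortized across the whole slice.
type IntegerBatchCursor interface {
	// Next returns the next block of timestamps and values. Both
	// slices are empty when the cursor is exhausted.
	Next() (timestamps []int64, values []int64)
	Close()
}

// sum consumes a cursor a block at a time. The inner loop ranges over
// a slice whose length is already known, which also lets the compiler
// elide per-element bounds checks.
func sum(cur IntegerBatchCursor) int64 {
	defer cur.Close()
	var total int64
	for {
		ts, vs := cur.Next()
		if len(ts) == 0 {
			return total
		}
		for i := range vs {
			_ = ts[i] // timestamp available alongside each value
			total += vs[i]
		}
	}
}
```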
Fixes #8989 and #8633.
Previously, when issuing commands involving a regex check, walking
through the tag keys/values on a measurement using the measurement's
index was racy.
This commit adds a new `TagKeyValue` type that abstracts away the
multi-layer map we were using as an inverted index from tag keys and
values to series IDs. With this abstraction we can also make concurrent
access to this inverted index goroutine-safe.
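A minimal sketch of that abstraction, with hypothetical names and a deliberately simplified layout (the real type in tsdb/index/inmem differs in detail):

```go
package inmem

import "sync"

// tagKeyValue hides the map from tag value to series IDs behind a
// mutex, so the multi-layer inverted index (tag key -> tag value ->
// series IDs) can be read and written from multiple goroutines safely.
type tagKeyValue struct {
	mu       sync.RWMutex
	valueIDs map[string][]uint64 // tag value -> series IDs
}

func newTagKeyValue() *tagKeyValue {
	return &tagKeyValue{valueIDs: make(map[string][]uint64)}
}

// InsertSeriesID associates a series with a tag value.
func (t *tagKeyValue) InsertSeriesID(value string, id uint64) {
	t.mu.Lock()
	t.valueIDs[value] = append(t.valueIDs[value], id)
	t.mu.Unlock()
}

// RangeValues invokes fn for each tag value under a read lock, which
// is what makes concurrent walks over the values safe. Returning
// false from fn stops the walk early.
func (t *tagKeyValue) RangeValues(fn func(value string, ids []uint64) bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	for v, ids := range t.valueIDs {
		if !fn(v, ids) {
			return
		}
	}
}
```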
Finally, this commit fixes a very old bug in the index which will affect
any query using a regex. Previously we would always check _every_ tag
against a regex for a measurement, even when we had found a match.
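Building on the sketch above, the fix amounts to short-circuiting the walk once any value matches (illustrative only; assumes `regexp` is imported):

```go
// matchesTagValue sketches the fix: stop walking the tag values as
// soon as one matches the regex, instead of testing every value.
func matchesTagValue(tkv *tagKeyValue, re *regexp.Regexp) bool {
	matched := false
	tkv.RangeValues(func(v string, _ []uint64) bool {
		if re.MatchString(v) {
			matched = true
			return false // a single match is enough; stop early
		}
		return true // keep scanning
	})
	return matched
}
```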
Changes the `influx_inspect inmem2tsi` tool to stream each TSM/WAL
file and convert to a TSI index instead of loading the entire shard's
in-memory index first.
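A rough sketch of the streaming shape, with entirely hypothetical types since the tool's real file readers and index writer differ:

```go
package main

// seriesWriter stands in for the TSI index being built.
type seriesWriter interface {
	AddSeries(key []byte) error
}

// fileReader stands in for a single TSM or WAL file.
type fileReader interface {
	// EachKey calls fn once per series key in the file.
	EachKey(fn func(key []byte) error) error
	Close() error
}

// convertShard visits each file in turn, feeds its series keys into
// the index, and releases the file before moving on, so memory stays
// bounded by one file at a time instead of the whole shard's
// in-memory index.
func convertShard(files []fileReader, idx seriesWriter) error {
	for _, f := range files {
		if err := f.EachKey(idx.AddSeries); err != nil {
			f.Close()
			return err
		}
		if err := f.Close(); err != nil {
			return err
		}
	}
	return nil
}
```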