* fix: prevent retention service from hanging
Fix issue that can cause the retention service to hang waiting on a
`Shard.Close` call. When this occurs, no other shards will be deleted
by the retention service. This is usually noticed as an increase in
disk usage because old shards are not cleaned up.
The fix adds to new methods to `Store`, `SetShardNewReadersBlocked`
and `InUse`. `InUse` can be used to poll if a shard has active readers,
which the retention service uses to skip over in-use shards to prevent
the service from hanging. `SetShardNewReadersBlocked` determines if
new read access may be granted to a shard. This is required to prevent
race conditions around the use of `InUse` and the deletion of shards.
If the retention service skips over a shard because it is in-use, the
shard will be checked again the next time the retention service is run.
It can be deleted on subsequent checks if it is no longer in-use. If
the shards is stuck in-use, the retention service will not be able to
delete the shards, which can be observed in the logs for manual
intervention. Other shards can still be deleted by the retention service
even if a shard is stuck with readers.
closes: #25054
ArrayCursors were ignoring errors, which led to panics when nil
cursors were operated on. This fix passes errors back up the stack
and uses them to enforce healthy cursor creation.
Closes https://github.com/influxdata/influxdb/issues/24789
---------
Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Upgrade to influxdata/influxql v1.2.0. While it does not fix any
known issues in InfluxDB OSS 1.x, it is necessary for upstream
projects impacted by https://github.com/influxdata/influxql/issues/65.
In addition to upgrading influxdata/influxql, this also updates test
cases that relied on the erroneous precision handling when stringifying
InfluxQL ASTs.
Visible impacts to InfluxDB OSS 1.x:
- Changes precision of floating point numbers in error messages
related to InfluxQL
- Changes precision of floating point numbers in "EXPLAIN" and
"EXPLAIN ANALYZE" output
- Changes precision of floating point numbers from InfluxQL
expressions included in tracing spans
closes: #24763
* chore: add scaffolding for naive solution
* feat: test case scaffolding
* fix: implement check for series key before proceeding
* fix: add validation for ReadSeriesKeyMeasurement usage
* refactor: explicit use of series key len
* feat: add remaining check to index
* feat: add check to remaining files
As the Len function is used as part of the parseSeriesKey, this also needs to be accounted for on the nil return from this function as it is used in different contexts
* feat: expand test cases
* chore: go fmt
* chore: update test failure message
* chore: impl feedback on unnecessary sz checks
* feat: expand test cases
* fix: nil series key check
In both sections for index.go there is a pre-existing length check against the series key which should catch invalid values, perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key because we know that subsequent calls will cause a panic where this key is attempted to be used
* fix: remove nil tags check
A key with no tags is valid, so we should not check for BOTH nil key and tags as a key could be nil, which is invalid, yet still have tags and therefore cause the check to pass which we do not want
* feat: extend test cases from feedback
* fix: extend checks for CompareSeriesKeys
* feat: add nilKeyHandler for shared key checking logic
* fix: logical error in nilKeyHandler
Prior to this, the else was always defaulted to at the end of the conditional branch, which causes unexpected behaviour and a failure of a bunch of tests.
* fix: return tags keep nil data
In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow on to this, I just realised that we don't actually want to nil back the tags, even if they're not checked, because having no tags is a valid input so we can simply return whatever we were passed unchanged.
* fix: use len == 0 for extra safety
* feat: extra test for blank series key
* fix: prevent retention service creating orphaned shard files
Under certain circumstances, the retention service can fail to delete shards from
the store in a timely manner. When the shard groups are pruned based on age, this
leaves orphaned shard files on the disk. The retention service will then not attempt
to remove the obsolete shard files because the meta store does not know about them.
This can cause excessive disk space usage for some users.
This corrects that by requiring shards files be deleted before they can be removed
from the meta store.
fixes: #24529
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
the file. Avoid walking off the end of the file while
iterating series entries in oddly sized files.
closes https://github.com/influxdata/influxdb/issues/24508
Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
* chore: delete unused signing script
* fix: execute "compact-series-file" at startup
* fix: add "INFLUXD_OPTS" to "influxd config" line
* fix: use CONFIG to store influxd config path
Series file indices monotonically grew even
when series were deleted. Also stop
ignoring error in series index recovery
Partially closes https://github.com/influxdata/EAR/issues/3643
An empty slice in Sources causes an empty
WHERE clause to be generated by
Statement.String(). Use a nil in the field
to omit the WHERE clause completely
closes https://github.com/influxdata/influxdb/issues/24144
Setting the DB or RP in measurements extracted from
the predicate and used as sources causes Enterprise
/api/v2/delete to fail. The DB must explicitly be passed
to Store.Delete without being set in the measurement
struct passed as a source.
closes https://github.com/influxdata/influxdb/issues/24139
The CLI code used a single function from the `golang.org/x/crypto/ssh/terminal`
package (`terminal.IsTerminal`) which is just a wrapper around the
`golang.org/x/term` package's `term.IsTerminal` function. Replacing this
call prevents unnecessary and non-FIPS crypto functions from being pulled
into the binary.
closes: #24035
Absolute file paths in influx_inspect check-schema
cause an 'Invalid Argument' error. This was caused
fs.WalkDir using fs.ValidPath. Replacing with
filepath.WalkDir permits absolute paths.
closes https://github.com/influxdata/influxdb/issues/23987