Before this, if you deleted everything with `delete where true`
for example, then you would be left with all of your measurements
in the fields index. That would cause ghost fields to reappear
if someone reinserted to the measurement.
This fixes that by making it so the deepest most delete code
checks if the measurement was removed from the index, and if so
cleaning it up out of the fields index.
Additionally, it fixes bugs in that cleanup code where if you had
a measurement like "m1" and "m10", when iterating over the cache
or file store, "m1" would match "m10" due to it only checking the
prefix. This also has it check the character right after the
measurement to be either a comma because tags started, or the first
character of the field separator.
This change fixes#10511 that manifests when a shard is considered cold
faster than its cache is snapshotted. This can happen if WAL is enabled
because previously the code only considered the last modification of
compacted tsm1 files. Instead Engine.LastModified() also takes the WAL
into account if necessary.
There are some problematic races that occur when deletes happen
against writes to the same points at the same time. This change
introduces guards and an epoch based system to coordinate these
modifications.
A guard matches a point based on the time, measurement name, and
some conditions loaded from an influxql expression. The intent
is to be as precise as possible without allowing any false
neagatives: if a point would be deleted, the guard must match it.
We are allowed to match more points than necessary, at the cost
of slowing down writes.
The epoch based system keeps track of outstanding writes and
deletes and their associated guards. When a delete operation
is going to start, it waits until all current writes are
done, and installs its guard, blocking all future writes that
contain points that may conflict with the delete. This allows
writes to disjoint points to proceed uncontended, and the
implementation is optimized for assuming there are few
outstanding deletes. For example, in the case that there are no
deletes, a write just has to take a mutex, bump a counter, and
compare a value against zero. The epoch trackers are per shard,
so that different shards never have to contend with one another.
TSI1 and inmem indexes have different properties during deletes.
Specifically, inmem shares a global index across all shards, where
every tsi1 index is contained to a specific shard. When deleting
a series, it may cause the last reference to the series across all
shards to be dropped, necessitating a removal from the series file.
Since the inmem index shares the index across all shards, removing
the series when it's removed from the series file is sufficient.
However, in the case of a mixed index database, if the last shard
is a TSI1 shard, the other inmem indexes are not available when we
discover that it was the last reference to the series. This ends
up leaving the series in the inmem index without a series id in
the series file, causing all sorts of misbehavior.
Rather than continue curling ourselves into a ball to try to fix
this unsupported mode, give a helpful error message to the user
that they must run their database in a non-mixed index mode to
allow deletes.
Most parts of `ApplyEnvOverrides` will check if the value is empty
before attempting to apply the environment override. One exception is
that when a type implements Unmarshaler and it is within a slice, it
does not check that an empty value was passed.
This change fixes it so that we will not call `UnmarshalText` when the
environment variable is set to empty.
If 120th or 240th value is not a 1, k still passes the check in the
switch, causing the last value to be lost. If this value occurs at
the boundary of a block, the max time will be incorrect, resulting in
compaction failing to make forward progress.
The access log filter allows the access log to be filtered by a status
code pattern. The pattern is a list of strings of the form `NXX`. At
least one number must be specified and up to 2 Xs can be used. That
means the filter can be an exact status code or it can be a range of
them. For example, `500` would only match the 500 http status code while
`5XX` would match any status code beginning with the number 5
(categorized as server errors). The pattern `50X` would also be
accepted. Both uppercase and lowercase Xs are allowed.
Multiple filters can be specified and the log line will be printed if
one of them matches. If there are no filters specified, all status codes
are printed.
Removes cloning measurement fields on writes, instead atomically swaps out
measurement field sets when fields are added (with new overhead of copying
existing fields whenever a new one is added).