The cursors were returning the wrong value in the case when points
existed in both the cache and tsm files with the same timestamp. The
cache value should have been returned, but the tsm value was returned
incorrectly.
Fixes #6439
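A minimal sketch of the intended behavior, assuming both sources yield points sorted by timestamp. The `point` type and `mergePreferCache` function are hypothetical names for illustration and are not the engine's actual cursor code.

```go
package main

import "fmt"

// point is a hypothetical (timestamp, value) pair, not InfluxDB's real type.
type point struct {
	ts  int64
	val float64
}

// mergePreferCache merges two timestamp-sorted slices. When both contain
// the same timestamp, the cache point is kept and the TSM point is dropped.
func mergePreferCache(cache, tsm []point) []point {
	out := make([]point, 0, len(cache)+len(tsm))
	i, j := 0, 0
	for i < len(cache) && j < len(tsm) {
		switch {
		case cache[i].ts < tsm[j].ts:
			out = append(out, cache[i])
			i++
		case cache[i].ts > tsm[j].ts:
			out = append(out, tsm[j])
			j++
		default: // same timestamp: return the cache value, skip the TSM one
			out = append(out, cache[i])
			i++
			j++
		}
	}
	out = append(out, cache[i:]...)
	out = append(out, tsm[j:]...)
	return out
}

func main() {
	cache := []point{{ts: 2, val: 20}, {ts: 3, val: 30}}
	tsm := []point{{ts: 1, val: 1}, {ts: 2, val: 2}}
	fmt.Println(mergePreferCache(cache, tsm))
}
```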
If a shard is empty for a specific field and the field type is something
other than a float, a nil iterator would get returned from one of the
empty shards and cause the combined iterators to be cast to the float
type and all other iterator types to be discarded (or for integers, to
be cast).
This is rare since most aggregates don't accept strings or booleans, but
a query like:
    SELECT distinct(string) FROM mydata
would return nothing if one of the shards didn't have a value for
`string`.
This change modifies the query engine to return nil for the shards
instead of a fake iterator and then to only use the fake iterator if the
final aggregate iterator is nil (meaning that no iterators could be
constructed for the field from any shard).
Fixes #6495.
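The idea can be sketched roughly as follows. The `Iterator` interface, the two iterator types, and `combine` are hypothetical stand-ins and do not match the query engine's real signatures.

```go
package main

import "fmt"

// Iterator is a hypothetical stand-in for the query engine's iterator interface.
type Iterator interface {
	Type() string
}

type stringIterator struct{}

func (stringIterator) Type() string { return "string" }

// nilFloatIterator plays the role of the "fake" iterator described above.
type nilFloatIterator struct{}

func (nilFloatIterator) Type() string { return "float" }

// combine drops the nil iterators returned by empty shards and falls back
// to the placeholder only when no shard produced an iterator at all.
func combine(perShard []Iterator) []Iterator {
	var out []Iterator
	for _, itr := range perShard {
		if itr != nil {
			out = append(out, itr)
		}
	}
	if len(out) == 0 {
		return []Iterator{nilFloatIterator{}}
	}
	return out
}

func main() {
	for _, itr := range combine([]Iterator{nil, stringIterator{}, nil}) {
		fmt.Println(itr.Type())
	}
}
```

Because the empty shards contribute nothing, the type of the combined iterator is decided only by shards that actually hold the field.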
If multiple tombstone entries happen to exist for the same key in a
tombstone file, it was possible to panic. The first application
would remove all index entries and the second time around the code
still assumed entries would exist and would index into the nil slice.
Also fixes a case where the range of time would fully delete all index
entries, but it did not align with math.MinInt64 and math.MaxInt64. This
would cause the index locations to still exist in the offset slice. This
is inefficient because the BlockIterator would still scan and decode the block
only to discover that all the values are deleted. We now just remove the entry from
the offsets slice in this case since the whole range of values is deleted.
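A rough sketch of both fixes, using a hypothetical map-based `index` rather than the real offsets slice: a repeated tombstone for the same key becomes a no-op, and an entry is dropped whenever its block lies entirely inside the deleted range, whether or not that range spans math.MinInt64 to math.MaxInt64.

```go
package main

import "fmt"

// indexEntry is a hypothetical index entry covering one block's time range.
type indexEntry struct {
	minTime, maxTime int64
}

// index maps a series key to its block entries (an assumption for this sketch).
type index map[string][]indexEntry

// deleteRange removes every entry whose block is fully covered by [tmin, tmax].
func (idx index) deleteRange(key string, tmin, tmax int64) {
	entries := idx[key]
	if len(entries) == 0 {
		return // already removed by an earlier tombstone for the same key
	}
	kept := entries[:0]
	for _, e := range entries {
		if tmin <= e.minTime && e.maxTime <= tmax {
			continue // block lies entirely inside the deleted range
		}
		kept = append(kept, e)
	}
	if len(kept) == 0 {
		delete(idx, key)
		return
	}
	idx[key] = kept
}

func main() {
	idx := index{"cpu": {{minTime: 10, maxTime: 20}}}
	idx.deleteRange("cpu", 0, 100)
	idx.deleteRange("cpu", 0, 100) // duplicate tombstone: must be a no-op
	fmt.Println(len(idx))
}
```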
When a large tombstone file existed on disk, this code was slow since
it would apply each tombstone to the index one at a time causing the
index to be scanned for each key.
Instead, we group all the tombstones together by timestamp and apply
them in bulk so that the index is scanned once for each set of tombstones.
If we change to immutable tombstone files, it might be better to just
write a file where all the keys have the same tombstone so we can re-apply
them efficiently.
This was the wrong fix. The real issue was that the tombstones were
being read incorrectly and also applied incorrectly at times. This
code is slower and not necessary so reverting it.
Each iteration of the loop was incrementing the position by 4 incorrectly.
The position should start at four since the header is 4 bytes. This
caused tombstones at the end of the file to not be read because the counter
was out of sync with the actual file position, which caused the loop to exit early.
Probably better to refactor this to check for io.EOF instead of using the counter.
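A sketch of what the io.EOF-based loop might look like. The 4-byte header comes from the description above; the entry layout (a uint32 length followed by the key bytes), the placeholder header bytes, and the `readKeys` and `encodeKeys` names are assumptions made only for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const headerSize = 4

// readKeys skips the 4-byte header, then reads length-prefixed keys until
// the reader reports io.EOF. No position counter is kept.
func readKeys(r io.Reader) ([]string, error) {
	var header [headerSize]byte
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return nil, err
	}
	var keys []string
	for {
		var n uint32
		err := binary.Read(r, binary.BigEndian, &n)
		if err == io.EOF {
			return keys, nil // clean end of file
		}
		if err != nil {
			return nil, err // includes io.ErrUnexpectedEOF on a torn length
		}
		key := make([]byte, n)
		if _, err := io.ReadFull(r, key); err != nil {
			return nil, err
		}
		keys = append(keys, string(key))
	}
}

// encodeKeys builds an in-memory file in the assumed layout for the demo.
func encodeKeys(keys ...string) *bytes.Buffer {
	buf := new(bytes.Buffer)
	buf.Write([]byte{0, 0, 0, 1}) // placeholder header bytes
	for _, k := range keys {
		binary.Write(buf, binary.BigEndian, uint32(len(k)))
		buf.WriteString(k)
	}
	return buf
}

func main() {
	keys, err := readKeys(encodeKeys("cpu", "mem"))
	fmt.Println(keys, err)
}
```

Since the loop stops only when the reader itself runs out of data, there is no separate counter that can drift from the real file position.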
The code for parsing a key out of the WAL or TSM files in the engine
was naive and didn't account for measurements with escape chars. This
uses the correct parsing code to parse and load them correctly.
Fixes #6496
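To illustrate the kind of problem involved, here is a small sketch that assumes a series key has the form `measurement,tag=value` with a backslash escaping a comma inside the measurement name. `measurementFromKey` is a hypothetical helper, not the parsing function the engine actually uses.

```go
package main

import "fmt"

// measurementFromKey returns the measurement part of a series key: everything
// before the first comma that is not preceded by a backslash. The result is
// returned still escaped.
func measurementFromKey(key string) string {
	for i := 0; i < len(key); i++ {
		if key[i] == '\\' {
			i++ // skip the escaped character
			continue
		}
		if key[i] == ',' {
			return key[:i]
		}
	}
	return key // no tags present
}

func main() {
	// A naive split on the first comma would stop after `cpu\`.
	fmt.Println(measurementFromKey(`cpu\,load,host=a`))
}
```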
This removes the dropMeta param from tsdb.Store.DeleteSeries and
lets the shard determine when to remove the metadata from the index
based on what series still have data in the shard.
This uncovered a nasty bug in compactions where a fully deleted series would
prematurely end the compactions and not carry forward the rest of the data
in the TSM file. This is now fixed as well.
There are two TSMIndex implementations, the directIndex and the
indirectIndex. Originally, we only had the directIndex and later
added the indirectIndex and NewTSMReaderWithOptions in order to
allow both indexes to be used in tests and code. This has created
a problem since we really only use the directIndex for writing and
always use the indirectIndex for reading.
This change removes the NewTSMReaderWithOptions func so that it is
no longer possible to create a TSMReader with a directIndex. This
will allow a lot of the block reading code used by the directIndex
to be removed and simplify maintenance. It also gives better test
coverage of the code that is actually used by the TSM engine now.
This adds support for a time range to tombstone files to allow a subset
of points to be deleted instead of the whole series. It changes the
tombstone file format to a binary format and maintains backwards compatibility
with the old text format tombstone files.
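The semantics can be sketched as below. The `Tombstone` struct, the inclusive bounds, and the helper names are assumptions for illustration. The sketch does not reproduce the on-disk binary layout; it only shows how an old-style whole-series tombstone can be treated as the widest possible time range.

```go
package main

import (
	"fmt"
	"math"
)

// Tombstone is a hypothetical in-memory form of one tombstone entry.
type Tombstone struct {
	Key      string
	Min, Max int64 // inclusive time range (an assumption of this sketch)
}

// wholeSeries models an entry from an old-format file, which deletes
// every point of the series.
func wholeSeries(key string) Tombstone {
	return Tombstone{Key: key, Min: math.MinInt64, Max: math.MaxInt64}
}

// apply returns the timestamps that survive the tombstone.
func (t Tombstone) apply(timestamps []int64) []int64 {
	var kept []int64
	for _, ts := range timestamps {
		if ts < t.Min || ts > t.Max {
			kept = append(kept, ts)
		}
	}
	return kept
}

func main() {
	ts := []int64{5, 10, 15, 20, 25}
	partial := Tombstone{Key: "cpu", Min: 10, Max: 20}
	fmt.Println(partial.apply(ts))
	fmt.Println(len(wholeSeries("cpu").apply(ts)))
}
```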
This commit changes the `FloatDecoder.val` from a `float64` type
to a `uint64` to avoid an additional type conversion during read.
Now the value gets converted to a `float64` only on a call to `Values()`.
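A simplified sketch of the idea, assuming an XOR-based encoding in which each step yields a delta to apply to the previous value's bits. The `floatDecoder` type, its `next` method, and `xorDelta` are hypothetical and much smaller than the real decoder.

```go
package main

import (
	"fmt"
	"math"
)

// floatDecoder keeps the current value as raw IEEE 754 bits so that the
// XOR step needs no float64 <-> uint64 round trip.
type floatDecoder struct {
	val uint64
}

// next applies an XOR delta directly to the stored bits.
func (d *floatDecoder) next(delta uint64) {
	d.val ^= delta
}

// Values converts to float64 only when the caller asks for the value.
func (d *floatDecoder) Values() float64 {
	return math.Float64frombits(d.val)
}

// xorDelta is a demo helper that produces the delta an encoder would emit.
func xorDelta(prev, next float64) uint64 {
	return math.Float64bits(prev) ^ math.Float64bits(next)
}

func main() {
	d := &floatDecoder{}
	d.next(xorDelta(0, 1.5))
	d.next(xorDelta(1.5, -2.25))
	fmt.Println(d.Values())
}
```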
This has various benefits:
- Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily.
- More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.*` fields, etc).
- This is also more efficient, because it means `executeQuery` no longer allocates a new `*log.Logger` each time it is called.
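The pattern might look roughly like this. The `QueryExecutor` struct shown here, its `Logger` field, and the `newBufferLogger` helper are a hypothetical sketch of a logger that is created once and can be replaced by an embedding program; they are not the actual InfluxDB definitions.

```go
package main

import (
	"bytes"
	"log"
	"os"
)

// QueryExecutor holds one logger for its whole lifetime (hypothetical type).
type QueryExecutor struct {
	Logger *log.Logger
}

// NewQueryExecutor allocates the default logger once, at construction time.
func NewQueryExecutor() *QueryExecutor {
	return &QueryExecutor{Logger: log.New(os.Stderr, "[query] ", log.LstdFlags)}
}

// executeQuery reuses the stored logger instead of building a new one.
func (e *QueryExecutor) executeQuery(q string) {
	e.Logger.Printf("executing %q", q)
}

// newBufferLogger returns a logger with a custom prefix that writes to memory,
// standing in for whatever logger an embedding program supplies.
func newBufferLogger(prefix string) (*log.Logger, *bytes.Buffer) {
	buf := new(bytes.Buffer)
	return log.New(buf, prefix, 0), buf
}

func main() {
	e := NewQueryExecutor()
	e.executeQuery("SELECT 1") // default logger writes to stderr

	l, buf := newBufferLogger("[embedded] ")
	e.Logger = l // an embedding program swaps in its own logger
	e.executeQuery("SELECT 2")
	os.Stdout.WriteString(buf.String())
}
```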
The cache max memory size is an approximate size and can prevent a
shard from loading at startup. This change disables the max size
at startup to prevent this problem and sets the limit back after
reloading.
Fixes #6109
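A minimal sketch of the approach, using a hypothetical `Cache` type in which a max size of 0 means "no limit". The field names, the `reload` function, and the error value are assumptions and not the engine's real API.

```go
package main

import (
	"errors"
	"fmt"
)

var errMemoryExceeded = errors.New("cache maximum memory size exceeded")

// Cache is a hypothetical cache that tracks an approximate size in bytes.
type Cache struct {
	maxSize uint64 // 0 disables the limit in this sketch
	size    uint64
}

// Write accounts for n bytes and rejects the write if the limit is exceeded.
func (c *Cache) Write(n uint64) error {
	if c.maxSize > 0 && c.size+n > c.maxSize {
		return errMemoryExceeded
	}
	c.size += n
	return nil
}

// SetMaxSize changes the limit.
func (c *Cache) SetMaxSize(n uint64) { c.maxSize = n }

// reload replays entries with the limit disabled and restores it afterwards.
func reload(c *Cache, entrySizes []uint64) error {
	limit := c.maxSize
	c.SetMaxSize(0)
	defer c.SetMaxSize(limit) // put the configured limit back after loading
	for _, n := range entrySizes {
		if err := c.Write(n); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	c := &Cache{maxSize: 100}
	fmt.Println(reload(c, []uint64{60, 60})) // loads even though 120 > 100
	fmt.Println(c.Write(1))                  // limit is enforced again
}
```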
This also switches the remaining iterators to be lazy so they can return
errors properly. They needed to be converted to lazy initialization
anyway, which has the side effect of making it much easier for us to
propagate the underlying error during initialization.
Updated the Emitter to return errors when it cannot read properly from
the iterators.
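As a rough illustration of why lazy initialization makes error propagation easier, the sketch below defers opening the underlying data until the first `Next` call and hands any failure back to the caller. `lazyIterator`, `emit`, and `errOpen` are hypothetical names and far simpler than the real iterators and Emitter.

```go
package main

import (
	"errors"
	"fmt"
)

var errOpen = errors.New("cannot open shard cursor")

// lazyIterator opens its data source on first use, so an open failure can be
// returned from Next instead of being lost at construction time.
type lazyIterator struct {
	open   func() ([]float64, error)
	opened bool
	vals   []float64
	err    error
}

// Next returns the next value, whether one was available, and any error.
func (it *lazyIterator) Next() (float64, bool, error) {
	if !it.opened {
		it.opened = true
		it.vals, it.err = it.open()
	}
	if it.err != nil {
		return 0, false, it.err
	}
	if len(it.vals) == 0 {
		return 0, false, nil
	}
	v := it.vals[0]
	it.vals = it.vals[1:]
	return v, true, nil
}

// emit drains the iterator and stops at the first error it reports.
func emit(it *lazyIterator) ([]float64, error) {
	var out []float64
	for {
		v, ok, err := it.Next()
		if err != nil {
			return out, err
		}
		if !ok {
			return out, nil
		}
		out = append(out, v)
	}
}

func main() {
	good := &lazyIterator{open: func() ([]float64, error) { return []float64{1, 2}, nil }}
	bad := &lazyIterator{open: func() ([]float64, error) { return nil, errOpen }}
	fmt.Println(emit(good))
	fmt.Println(emit(bad))
}
```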