* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted
to the inmem package
* improve Measurement and Series types by exporting immutable
fields and removing unnecessary APIs and locks
Reduced startup time from 28s to 17s. Overall improvement including
#9162 reduces startup from 46s to 17s for 1MM series across 14 shards.
This commit ensures that the series file should work appropriately on
32-bit architecturs. It does this by reducing the maximum size of a
series file to 512MB on 32-bit systems, which should be fully
addressable.
It further updates tests so that the series file size can be reduced
further when running many tests in parallel on 32-bit architectures.
This commit firstly ensures that a shard's size on disk is accurately
reported when using the tsi1 index, by including the on-disk size of the
tsi1 index in the calculation.
Secondly, this commit add support for shard streaming/copying when using
the tsi1 index. Prior to this, a tsi1 index would not be correctly
restored when streaming shards.
The DropSeries code path ended up creating a MeasurementSeriesIterator
for each dropped series, this was too expensive just to see if a
series exists.
This adds a HasSeries func and fixes and issue where TSI files were
compacted while an iterator was still in use causing a panic.
The previous sha was taken from a revision on a devel branch that I
thought would continue staying in the tree after it was merged. That
revision was rebased away and the API was changed for the logger.
This updates the usage of the logger and adds a simple package for
constructing the base logger.
The 1.0 version of zap changed the format of the default console logger
so this change moves over to this new logger instead of attempting to
retain backwards compatibility with the old format.
This commit carries out the initial refactor of the tsi1.Index into
tsi1.Partition. We then create a new tsi1.Index that will be an
abstraction over a collection of Partitions.
Fixes#8989 and #8633.
Previously when issuing commands involving a regex check, walking
through the tags keys/values on a measurement, using the measurement's
index, would be racy.
This commit adds a new `TagKeyValue` type that abstracts away the
multi-layer map we were using as an inverted index from tag keys and
values to series ids. With this abstraction we can also make concurrent
access to this inverted index goroutine safe.
Finally, this commit fixes a very old bug in the index which will affect
any query using a regex. Previously we would always check _every_ tag
against a regex for a measurement, even when we had found a match.
TSI did not check that the max select series limit during planning
the same way that inmem did. This means that the limit could be
set but the planning of a high cardinality query would still OOM
the server. This fixes that limit as well as makes the query interruptible
during planning.
This commit adds a basic TSI versioning scheme, by adding a Version field
to an index's MANIFEST file.
Existing TSI indexes will not have this field present in their MANIFEST
files, and thus will be deemed incomatible with the current version.
Users with existing TSI indexes will be able to remove them, and convert the
resulting inmem indexes to the current version of a TSI index using the
influx_inspect tooling.
Deleting high cardinality series could take a very long time, cause
write timeouts as well as dead lock the process. This fixes these
issue to by changing the approach for cleaning up the indexes and
reducing lock contention.
The prior approach delete each series and updated every index (inmem)
during the delete. This was very slow and cause the index to be locked
while it items in a slice were removed one by one. This has been changed
to mark series as deleted and then rebuild the index asynchronously which
speeds up the process.
There was also a dead lock that could occur when deleing the field set.
Deleting the field set held a write lock and the function it invoked under
the lock could try to take a read lock on the field set. This would then
deadlock. This approach was also very slow and caused time out for writes.
It now uses faster approach that checks for the existing of the measurment
in the cache and filestore which does not take write locks.
There are several places in the code where comma-ok map retrieval was
being used poorly. Some were benign, like checking existence before
issuing an unconditional delete with no cleanup. Others were potentially
far more serious: assuming that if 'ok' was true, then the resulting
pointer retrieved from the map would be non-nil. `nil` is a perfectly
valid value to store in a map of pointers, and the comma-ok syntax is
meant for when membership is distinct from having a non-zero value.
There was only one or two cases that I saw that being used correctly for
maps of pointers.
This change provides a clear separation between the query engine
mechanics and the query language so that the language can be parsed and
dealt with separate from the query engine itself.
The sortedSeriesIds slice was not getting reset to 0 which caused
the same series ids to exist in the slice more than once. Since
the size of the slice never matched the size of the seriesID map,
it kept appendending to the slice and sorting it which cause multiple
cursor to get created for the same series.
Fixes#8531
Currently two write locks in `inmem` are obtained and then
manually unlocked at function exit points. However, we have
reports that the `inmem` index is hanging on a write lock and
cannot track the issue down to anything else besides a lock
that could have been left unlocked because of a panic.
This commit changes the two locks to always defer their unlocks
to prevent these hangs.
This fixes the case where log files are compacted out of order
and cause non-contiguous sets of index files to be compacted.
Previously, the compaction planner would fetch a list of index files
for each level and compact them in order starting with the oldest
ones. This can be a problem for level 1 because level 0 (log files)
are compacted individually and in some cases a log file can finish
compacting before older log files are finished compacting. This
causes there to be a gap in the list of level 1 files that is
ignored when fetching a list of index files.
Now, the planner reads the list of index files starting from the
oldest but stops once it hits a log file. This prevents that gap
from being ignored.
This check was previously in a different section of code which
was lost during a refactor to the new compaction strategy. The
compaction planning now makes a check to ensure at least two
files are available for compaction in a level.
Measurement name and field were converted between []byte and string
repetively causing lots of garbage. This switches the code to use
[]byte in the write path.
Index.ForEachMeasurementTagKey held an RLock while call the fn,
if the fn made another call into the index which acquired an RLock
and after another goroutine tried to acquire a Lock, it would deadlock.
Under high write load, the check for each series was done sequentially
which caused a lot of CPU time to acquire/release the RLock on LogFile.
This switches the code to check multiple series at once under an RLock
similar to the chang for inmem.
The inmem index would call CreateSeriesIfNotExist for each series
which takes and releases and RLock to see if a series exists. Under
high write load, the lock shows up in profiles quite a bit. This
adds a filtering step that obtains a single RLock and checks all the
series and returns the non-existent series to contine though the slow
path.