507 Commits (dd34f5fd9d43c87f690baba0927642f1e76befe6)

Author SHA1 Message Date
Ben Johnson 10a2063dcc
fix(tsdb): Fix tsm1 block merge.
Fixes the `tsm1.BlockIterator` so that it returns the current
key if there are still additional entries remaining. This previously
caused multiple entries not to be merged together during compaction
because the iterator would check if the next key matched the current
key but the key for the next set of entries was returned.
2019-07-03 10:08:51 -06:00
Stuart Carnie a42ff1628d
fix(influxd): --pattern flag matches specified substring
Previously the logic was inverted so `--pattern` matched
everything but the specified value.
2019-07-03 12:02:19 +10:00
Max U af257e93ff initial commit for clearing tsm files when replace fails 2019-07-02 13:48:31 -04:00
Ben Johnson 08e24faf4c
feat(tsdb): Add block exporter.
Adds export tooling to `influxd inspect export-blocks` so that we
can dump out block data in SQL format for better analysis during
the debugging process.
2019-07-01 10:10:52 -06:00
Tanya Gordeeva fe4333e8e0 fix(storage): fix tracking disk bytes in memory 2019-06-27 16:36:00 -07:00
Tanya Gordeeva 3ff15a8b41 fix(storage): fix counts for level 4+ files
The counts weren't adding all the level 4+ files, so the last one to be counted
would override the rest.
2019-06-27 16:36:00 -07:00
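
The level 4+ counting bug described above can be illustrated with a minimal Go sketch. The names `levelLabel` and `countByLevel` are hypothetical and are not taken from the InfluxDB source; the sketch only assumes that every level of 4 or higher shares one "4+" bucket and that the per-bucket count has to be accumulated rather than assigned.

```go
package main

import (
	"fmt"
	"strconv"
)

// levelLabel is a hypothetical helper: every level >= 4 shares the "4+" bucket.
func levelLabel(level int) string {
	if level >= 4 {
		return "4+"
	}
	return strconv.Itoa(level)
}

// countByLevel tallies files per bucket. Incrementing accumulates the count;
// assigning a per-file value here would let the last file override the rest.
func countByLevel(levels []int) map[string]int {
	counts := make(map[string]int)
	for _, l := range levels {
		counts[levelLabel(l)]++
	}
	return counts
}

func main() {
	counts := countByLevel([]int{1, 4, 5, 6, 2, 4})
	fmt.Println(counts["1"], counts["2"], counts["4+"])
}
```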
Ben Johnson b3d7986d4b
chore(tsdb): Fix read metrics declaration. 2019-06-27 09:25:27 -06:00
Ben Johnson 12549c859e
feat(tsdb): Add basic tsdb read metrics
Adds a total cursor counter and seek location counter to a new
`readMetrics` that is added to each `Engine`. Default labels group
by `engine_id` and `node_id`.
2019-06-26 16:16:24 -06:00
tmgordeeva fb69c5d06c
Merge pull request #13698 from influxdata/tg-fix-metrics
fix(storage): reduce tsm level metrics cardinality
2019-06-20 17:57:37 -07:00
Tanya Gordeeva 6428cdbce6 fix(storage): initialize tsm file metrics, update after compaction
These metrics weren't being properly initialized on opening the file store, and
weren't being properly updated on compaction.
2019-06-20 14:37:53 -07:00
Tanya Gordeeva 85dc52a93b fix(storage): reduce tsm level metrics cardinality
This should have cut off TSM file levels at 4+.
2019-06-20 14:37:33 -07:00
Ben Johnson 14980d55b8
fix(storage): Add WithCurrentGenerationFunc() for generation injection.
Adds the ability to set the current generation to use when compacting
the cache only. Previously, we used the current generation for all
files but this causes issues and we should only use the current
generation for level 1 compaction.
2019-06-20 08:54:38 -06:00
Ben Johnson a181e60d70
fix(tsdb): Fix series file count (#13770)
2019-06-11 10:07:12 -06:00
Christopher Wolff a82e2cb180 chore(tsdb): skip flaky test 2019-05-30 16:29:31 -07:00
Alirie Gray 576da8f9d2 fix(swagger): add log property to task runs endpoint docs 2019-05-17 14:08:10 -07:00
Nathaniel Cook faa5fddf7b Merge branch 'master' into flux-staging 2019-05-15 10:12:14 -06:00
Christopher Wolff 52a98aae2b chore(tsdb): skip flaky test
https://github.com/influxdata/influxdb/issues/13755
2019-05-14 12:52:37 -07:00
Jacob Marble 95f28cb571
fix(series file): Sync series segment after truncate (#13836) (#13859) 2019-05-10 11:25:43 -07:00
Jacob Marble aa5c77409d
backport: Fix open/close race in SeriesFile (#13837) 2019-05-08 11:39:24 -07:00
zhulongcheng fbd6e9f5c4 fix(tsm1): check if blocks are overlapping in KeyCursor 2019-05-04 22:25:09 +08:00
Edd Robinson 3588c0505e fix(storage): don't remap renamed TSM file
There exists a possibility for an in-flight read on a TSMReader to read
a stale reference to an mmapped TSM file index, which has become
unmapped.

This commit resolves that issue by simply renaming the file, leaving the
original file handler open and the data mapped. The path is updated so
that if any callers need to refer to the name of the TSM file after it's
renamed, the new name will be reflected.

The orphaned file handler will be closed when the TSM file is closed.
2019-05-03 22:36:35 +01:00
Ben Johnson a5ccf5ce9a
fix(tsdb): Fix series file count
Previously the series file did not include tombstones in the total
count. This commit now includes tombstones in the count as well as
fixes an issue where replayed tombstone records could exist but
their underlying ID did not exist. This caused the count to become
negative and, with the count being `uint64`, it caused the count to
roll over to `math.MaxUint64`.
2019-05-03 09:58:13 -06:00
Jeff Wendling ef0768db31
tsm1: predicate deletes (#13371)
2019-05-03 14:27:25 +00:00
Stuart Carnie bf774b66ce
fix(storage): Ensure Tag(Keys|Values) APIs never return (nil, nil)
Formalized this post condition in the documentation and added additional
unit tests.

Added a nil guard and unit test to WriteStringIterator.
2019-05-02 09:45:38 -07:00
Jeff Wendling 16e9eb4cb9 tsdb: respond to feedback and improve test coverage
predicate.go:
	UnmarshalPredicate       100.0%
	NewProtobufPredicate     100.0%
	Matches                  100.0%
	Marshal                  100.0%
	walkPredicateNodes       100.0%
	buildPredicateNode       100.0%
	newPredicateState        100.0%
	Reset                    100.0%
	Set                      100.0%
	newPredicateCache        100.0%
	Cached                   100.0%
	Store                    100.0%
	Update                   100.0%
	Update                   92.9%
	Update                   94.1%
	predicateEval            90.9%
	predicatePopTag          100.0%
	predicatePopTagEscape    100.0%
2019-05-01 13:40:40 -06:00
Jeff Wendling 4b4a814d7d storage: fix predicate matching on field tags 2019-05-01 13:40:40 -06:00
Jeff Wendling 740d669514 tsm1: teach the cache about predicates 2019-05-01 13:40:40 -06:00
Jeff Wendling 4fb7bf1730 tsm1: implement predicate matcher from protobufs 2019-05-01 13:40:40 -06:00
Jeff Wendling 4096f93891 tsm1: implement reading and writing predicates in tombstone files 2019-05-01 13:40:40 -06:00
Jeff Wendling dcf797f111 tsm1: basic predicate implementation at index layer
Only wires it up. No tests, no tombstone tracking, nothing.
2019-05-01 13:40:40 -06:00
Jeff Wendling 7403fd8aa9 tsm1: rename engine method to DeletePrefixRange
The storage/engine knows about buckets, but the tsm1/engine doesn't, so
name the tsm1/engine method Prefix and keep the storage/engine named
Bucket.
2019-05-01 13:40:40 -06:00
Jacob Marble 8c269e0153
chore(log): Put trace_id back in logs (#13712)
* chore(log): Put trace_id back in logs

* fix tests
2019-04-30 18:51:22 -07:00
Stuart Carnie 65e4e3c5de
Merge pull request #13701 from influxdata/sgc/bp/2.x/13687
Don't panic when encoding string blocks and length of source slice is too large
2019-04-30 10:02:40 -07:00
Stuart Carnie 369a4610e6
fix(storage): Don't panic when length of source slice is too large
StringArrayEncodeAll will panic if the total length of strings
contained in the src slice is > 0xffffffff. This change adds a unit
test to replicate the issue and an associated fix to return an error.

This also raises an issue that compactions will be unable to make
progress under the following condition:

* multiple string blocks are to be merged to a single block and
* the total length of all strings exceeds the maximum block size that
  snappy will encode (0xffffffff)

The observable effect of this is errors in the logs indicating a
compaction failure.

Fixes #13687
2019-04-29 13:29:41 -07:00
Jeff Wendling 9cd7c0f7e3 tsi1: don't do verbose debug logging unless test fails 2019-04-29 14:01:45 -06:00
Stuart Carnie 7b97a41dcb
feat(storage): Teach TagKeys, TagValues how to accumulate statistics
This commit teaches the storage schema APIs how to track statistics
and make them available via the returned `cursors.StringIterator`.

Statistics are only tracked when decoding TSM blocks or when scanning
the in-memory cache.

Closes #13541
2019-04-24 11:14:22 -07:00
Stuart Carnie ed344d25f8
feat(storage): Teach storage how to find a distinct set of tag keys
The TagValues API will perform a linear scan if there is no predicate;
otherwise, it will use the index to find a list of candidate series
keys.

TagKeys expects the predicate to be transformed such that
`_measurement` and `_field` are remapped to `\x00` and `\xff`
respectively.

There is one TODO marked to analyze the predicate for a
`\x00 = '<measurement>'` pattern. If found, the predicate can be
eliminated and fall back to a linear prefix scan by combining the org,
bucket and measurement. This is tracked by issue #13497.
2019-04-24 11:14:22 -07:00
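
The key remapping that TagKeys expects, as described above, could look roughly like this in Go. `remapKey` is a hypothetical helper written for illustration; the log does not show how the real predicate transformation is implemented.

```go
package main

import "fmt"

// remapKey is a hypothetical helper: it maps the user-facing names to the
// internal tag keys quoted in the commit message and leaves other keys alone.
func remapKey(key string) string {
	switch key {
	case "_measurement":
		return "\x00"
	case "_field":
		return "\xff"
	default:
		return key
	}
}

func main() {
	for _, k := range []string{"_measurement", "_field", "host"} {
		fmt.Printf("%q -> %q\n", k, remapKey(k))
	}
}
```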
Ben Johnson 272f340c30
Merge point parse & explode. 2019-04-24 10:12:15 -06:00
Jeff Wendling 59279837e5 tsi1: partition close deadlock
When a tsi1 partition closes, it waits on the wait group for compactions
and then acquires the lock. Unfortunately, a compaction may start in the
meantime, holding on to some resources. Then, close will attempt to
close those resources while holding the lock. That will block until
the compaction has finished, but it also needs to acquire the lock
in order to finish, leading to deadlock.

One cannot just move the wait group wait into the lock because, once
again, the compaction must acquire the lock before finishing. Compaction
can't finish before acquiring the lock because then it might be operating
on an invalid resource.

This change splits the locks into two: one to protect just against
concurrent Open and Close calls, and one to protect all of the other
state. We then just close the partition, acquire the lock, then free
the resources. Starting a compaction requires acquiring a resource
to the partition itself, so that it can't start one after it has
started closing.

This change also introduces a cancellation channel into a reference
to a resource that is closed when the resource is being closed, allowing
processes that have acquired a reference to clean up quicker if someone
is trying to close the resource.
2019-04-22 09:06:32 -06:00
Tanya Gordeeva 97572ee878 feat(storage): add tsm level metrics
Adds prometheus metrics recording compaction levels for TSM files.
2019-04-19 13:33:52 -07:00
Stuart Carnie d5341a1a4a
feedback: Fix comments in template 2019-04-18 16:19:19 -07:00
Stuart Carnie 972cda1775
feedback: Changes in response to PR feedback 2019-04-18 16:19:18 -07:00
Stuart Carnie 904c91aecc
chore: Fix staticcheck complaints 2019-04-18 16:19:18 -07:00
Stuart Carnie d3790aa072
feat: Teach storage engine how to find tag values for a given key
The TagValues API will perform a linear scan if there is no predicate;
otherwise, it will use the index to find a list of candidate series
keys.

TagValues expects the predicate to be transformed such that
`_measurement` and `_field` are remapped to `\x00` and `\xff`
respectively.

There is one TODO marked to analyze the predicate for a
`\x00 = '<measurement>'` pattern. If found, the predicate can be
eliminated and fall back to a linear prefix scan by combining the org,
bucket and measurement.
2019-04-18 16:19:18 -07:00
Stuart Carnie 35e0094a28
feat: TimeRangeIterator for checking if keys have data in a TSM file
The TimeRangeIterator permits linear or random index scans and
can answer whether the current key has data for the specified time
interval, considering any tombstones.

When there are no tombstones there are some opportunities for
optimization to skip decoding blocks. Specifically, if the
queried time interval overlaps any boundaries of the TSM index entries.
2019-04-18 16:19:18 -07:00
Stuart Carnie 7544ea0a5b
feat: Teach Values how to determine it contains data for a time interval
Add a Contains API which is a peer to the TimestampArray.Contains
function. This is used by the schema APIs to determine if data exists
in the cache for a given key and time interval.
2019-04-18 16:19:18 -07:00
Stuart Carnie 1ddd0445d8
feat(tsm1): Add Seek API to TSMIndexIterator
Permits random access of the iterator, correctly maintaining state,
so that Next may be called to iterate from a given key.

This API will be used by the schema APIs when a predicate is specified,
typically requiring random access.
2019-04-18 16:19:18 -07:00
Stuart Carnie 36a33bcb9f
feat(tsdb): Teach storage how to only decode timestamps from a block
TimestampArray.Contains(min,max) API performs a binary search to
determine if timestamps exist for the given time interval.

It also implements Exclude to drop timestamps that have been tombstoned.

DecodeTimestampArrayBlock decodes only the timestamps of the provided
block.
2019-04-18 16:19:18 -07:00
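
A minimal sketch of the binary search behind a `Contains(min, max)` check on sorted timestamps, as described above. The function below is an illustration on a plain `[]int64`, not the `TimestampArray` method itself, and it assumes both bounds are inclusive, which the log does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// contains reports whether the sorted slice ts holds any timestamp in
// [lo, hi]. Inclusive bounds are an assumption made for this sketch.
func contains(ts []int64, lo, hi int64) bool {
	if lo > hi {
		return false
	}
	// Find the first timestamp >= lo, then check that it does not exceed hi.
	i := sort.Search(len(ts), func(i int) bool { return ts[i] >= lo })
	return i < len(ts) && ts[i] <= hi
}

func main() {
	ts := []int64{10, 20, 30}
	fmt.Println(contains(ts, 15, 25)) // true
	fmt.Println(contains(ts, 21, 29)) // false
}
```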
Stuart Carnie 7fc9661b7b
chore: Move StringIterator to cursors package for wider reuse 2019-04-18 16:19:17 -07:00
Stuart Carnie e74f2f8e08
chore(cursors): Remove unused field 2019-04-18 16:19:17 -07:00
Stuart Carnie d67b1ef245
fix(cursors): Add go:generate directive 2019-04-18 16:19:17 -07:00
Todd Persen 138c17f22c Fix typos in tsdb package 2019-04-17 12:55:38 -07:00
Ben Johnson 2b3ce82852
fix(tsdb): Remove TSI stats file cache
Removes the `STATS` file generated during TSI compaction as it had
potential for becoming inconsistent with the index data. Instead,
stats are recalculated on start up and on each compaction on a
per-partition basis.

Computing stats for 10M series across 10K measurements takes
approximately 0.171s.
2019-04-17 09:34:32 -06:00
Jacob Marble f56c42794b
chore(tracing): Cleanup (#13296)
* chore(tracing): Cleanup

* broken test

* fix unused var

* fix test
2019-04-10 19:28:21 -07:00
Ben Johnson 307bb6af9c
Improve bulk series file writes. 2019-04-05 14:38:58 -06:00
Jeff Wendling 96a01eecf2 change an inaccurate comment 2019-03-30 10:24:15 -06:00
Jeff Wendling cbefaeb7f5 tsm1: make cache limit error a type
This makes it easier and more robust to check if an error is due
to the cache memory limit being exceeded.
2019-03-30 10:24:15 -06:00
Jeff Wendling 647deb475c tsm1: move cache entry to its own file 2019-03-30 10:24:15 -06:00
Jeff Wendling fad1e07d1d tsm1: clean up some dead/useless code in the cache
The storer interface isn't necessary if the init/Free logic is
removed, which is unnecessary in a world with only one shard.
Additionally, there were some cases where an init/Free call could
race and cause data loss in the cache. Not doing it at all fixes
all of those races.
2019-03-30 10:24:15 -06:00
Jeff Wendling 591e94dad9 tsm1: rings are fixed at 16 partitions
The code actually didn't work if 16 wasn't passed. Indeed, the
benchmarks weren't even working. Fix up all that, and reduce
the complexity some.
2019-03-30 10:24:15 -06:00
Jonas Hahnfeld 89ced057cb Fix compaction logic on infrequent cache snapshots
This change fixes #10511 that manifests when a shard is considered cold
faster than its cache is snapshotted. Previously the code only looked at
the last modification of compacted tsm1 files. Instead the (restored)
Engine.lastModified() also takes the cache into account.

Ports #10522 to master where engine.go has moved and Engine.LastModified()
was deleted because it was unused.
2019-03-28 22:21:59 +01:00
Edd Robinson 9a42202b53 PR feedback 2019-03-26 09:57:01 +00:00
Edd Robinson aa4e652e43 Add reason to total compaction metric
This commit adds a reason label to the total compaction metric. For
snapshots, the reason will indicate why the cache was snapshotted. For
other compactions, the reason label will be blank.
2019-03-25 15:25:03 +00:00
Edd Robinson dbca30dac5 Add integration tests for cache snapshotting 2019-03-25 11:44:01 +00:00
Edd Robinson 55e9ed689f Allow the tsm1.Cache to be snapshotted due to age
This commit adds a new Cache option, via the
`tsm1.CacheConfig.SnapshotAgeDuration` field, which controls the maximum
age the cache can reach before it is snapshotted to a TSM file.

The default value for this option is `0`, which means that the cache
will never be snapshotted based only on age. Setting this value to, for
example, 10 seconds, would result in the cache snapshotting every 10
seconds.

Snapshotting the cache more frequently can provide better durability
guarantees in some circumstances, though more, smaller TSM files will
lead to more work needed to compact them down to larger, more dense
files.

When using InfluxDB with a WAL there isn't really a strong reason to
alter `tsm1.CacheConfig.SnapshotAgeDuration` from `0`.
2019-03-25 11:44:01 +00:00
Edd Robinson af3f7bc9cb Add new cache configuration value 2019-03-25 11:44:01 +00:00
Edd Robinson 4022db03c2 Provide explicit cache snapshot reasons 2019-03-25 11:44:01 +00:00
Edd Robinson c4cc3ca7bc Fix 2019-03-19 15:12:35 +00:00
Edd Robinson f383ec9225 Add ability to use report-tsm programmatically 2019-03-19 14:29:25 +00:00
Edd Robinson 3b39832ba5 Reduce garbage 2019-03-19 14:28:51 +00:00
Edd Robinson a6447b6ca5 Refactor tsm report for 2.0 2019-03-19 14:25:53 +00:00
Edd Robinson fdae1ae5ea Expose field key sep 2019-03-19 14:25:53 +00:00
zhulongcheng a33c325890 storage: pr review changes 2019-03-12 22:15:28 +08:00
zhulongcheng 2554f1c5dd storage: add SeriesOffsetSize constant 2019-03-12 10:51:22 +08:00
Jacob Marble 603a1f26e0 use tracing.StartSpanFromContext 2019-03-07 12:12:31 -07:00
Edd Robinson 582ed6834c Address PR feedback 2019-03-07 09:56:07 +00:00
Edd Robinson 1cb20b654d ExplodePoints now complies with new keys 2019-03-07 09:56:07 +00:00
Edd Robinson f029f1645d Change location and value for internal tag keys 2019-03-07 09:56:07 +00:00
Jeff Wendling f53f9cd949 storage: detect conflicting types in a single batch of points
When the WAL was moved up, the validation that happened at the cache
was skipped. This moves the field type validation for a batch of
points up ahead of the WAL again.
2019-03-06 10:30:52 -07:00
Jacob Marble b9c7ec439e
feat(influxd): Tracing refactor (#12318)
* feat(launcher): Tracing to log disabled by default

* remove traceLogger and use opentracing directly

* add Jaeger tracing

* go vet && go fmt
2019-03-04 11:48:11 -08:00
Jeff Wendling 052421d5d6
Merge pull request #12207 from influxdata/jmw-fix-resource-ownership
storage: fix problems with keeping resources alive
2019-03-04 10:58:46 -07:00
Ben Johnson 12d35f1a50
Revert "Merge point parse & explode."
This reverts commit 1004abc3e1.
2019-03-02 06:23:04 -07:00
Ben Johnson 1004abc3e1
Merge point parse & explode. 2019-03-01 15:55:37 -07:00
Jeff Wendling ef425b7bf9 tsdb: fix disabling metrics in the series index
During Recover, we forgot to propagate the disabled flag to the
keyIDMap options like we do during Open. Since we still do propagate
the singleton `ims` which is initialized lazily, if the first
initialization has a different set of labels, it will cause an
inconsistent usage even if the metrics are disabled.
2019-03-01 12:11:16 -07:00
Jeff Wendling 0fae44e219 storage: fix problems with keeping resources alive
This commit adds the pkg/lifecycle.Resource to help manage opening,
closing, and leasing out references to some resource. A resource
cannot be closed until all acquired references have been released.
If the debug_ref tag is enabled, all resource acquisitions keep
track of the stack trace that created them and have a finalizer
associated with them to print on stderr if they are leaked. It also
registers a handler on SIGUSR2 to dump all of the currently live
resources.

Having resources tracked in a uniform way with a data type allows us
to do more sophisticated tracking with the debug_ref tag, as well.
For example, we could panic the process if a resource cannot be
closed within a certain time frame, or attempt to figure out the
DAG of resource ownership dynamically.

This commit also fixes many issues around resources, correctness
during error scenarios, reporting of errors, idempotency of
close, tracking of memory for some data structures, resource leaks
in tests, and out of order dependency closes in tests.
2019-02-28 10:22:01 -07:00
Jacob Marble 4e5253d581
Feat/add zeros to tsm filename (#12174)
* unit tests to confirm The Old Way®

* feat: Increase TSM generation max value to 1 trillion
2019-02-27 14:59:38 -08:00
zhulongcheng 4f9f85de84 tsdb: cleanup shard errors 2019-02-18 21:25:30 +08:00
Jeff Wendling 26ca30e97a Ensure that cached series id sets are Go heap backed 2019-02-12 16:33:35 -07:00
Ben Johnson cf29b6bca4
Convert TagValueSeriesIDCache to use string fields 2019-02-12 14:45:38 -07:00
Edd Robinson 0858b2570d Rename --> RenameFileWithReplacement for clarity 2019-02-12 12:41:10 +00:00
Edd Robinson bd8a167a3e Rename file package to fs 2019-02-12 11:24:11 +00:00
Jeff Wendling 3bb765279b storage: respond to review comments 2019-02-04 12:26:26 -07:00
Jeff Wendling b4823d11bf storage: double check the cache to avoid deleting keys that still exist 2019-02-04 10:58:17 -07:00
Jeff Wendling 3014733b20 chore: fix staticcheck issues 2019-02-04 10:32:52 -07:00
Jeff Wendling a424bf3e4c tsm1: implement DeleteBucketRange for the Cache 2019-02-04 10:32:52 -07:00
Jeff Wendling 376b347d56 wal: change deletes to be based on DeleteBucket 2019-02-04 10:32:52 -07:00
Jeff Wendling 7f54e816e3 refactor: have retention use DeleteBucketRange 2019-02-04 10:32:52 -07:00
Jeff Wendling aa12144fc7 storage: replay the WAL through the whole engine 2019-02-04 10:32:52 -07:00
Jeff Wendling 6deced1215 refactor: make the WAL part of snapshots again 2019-02-04 10:32:52 -07:00
Jeff Wendling 2989936d5a refactor: write to the WAL again 2019-02-04 10:32:52 -07:00
Jeff Wendling a3e66755ca refactor: move value aliases into its own file 2019-02-04 10:32:52 -07:00
Jeff Wendling 2f46937527 refactor: move value package up to tsdb 2019-02-04 10:32:52 -07:00
Jeff Wendling d2ddd48eea refactor: hook up metrics and wal to storage engine
It turns out that LastModified and DiskSize are unused, and so it
was easy to change to not care about the WAL.

This hooks up metrics and starts the WAL again.
2019-02-04 10:32:52 -07:00
Jeff Wendling 95de3d52b2 refactor: use concrete WAL in tsm1
At the cost of some nil checks, we don't have to have an interface, defend against
subtle bugs with nils in non-nil interfaces, an empty implementation, etc.

Also, the tsm1 engine is losing the WAL anyway.
2019-02-04 10:32:52 -07:00
Jeff Wendling c9bb55b889 refactor: move the tsm1/wal into the storage/wal package
Because the WAL relies on the tsm1.Value type, we move that into its own
tsm1/value package and set up some aliases forwarding them into tsm1. This
also required adding some methods and changing consumers to avoid the
unexported fields. I imagine this step will be useful one day when we make
the write path more efficient with respect to consuming points.

This commit additionally fixes some issues with generation. The iterator.tmpldata
and generation for array_cursor_* were removed accidentally when removing
iterators, making those generated files stale. Restore that and regenerate.

No change in functionality.
2019-02-04 10:32:52 -07:00
Chris Goller e5d773cee3
Merge pull request #11492 from asashour/typos
Fix typo
2019-02-04 08:33:41 -06:00
Edd Robinson 1188d75a99
Merge pull request #11202 from influxdata/er-tsi-times
Add skeleton TSI design doc
2019-01-28 11:54:59 -08:00
Edd Robinson 19a36e0dc7 Remove copy-on-write when caching bitmaps
In the case of caching TSI bitmaps belonging to immutable .tsi files,
the underlying bitset data can be mmapped. It is possible, though rare,
for this data to be unmapped (e.g., via a TSI compaction) but for the
cached bitmap to be subsequently read. This leads to a segfault.

This only happens when copy-on-write is set to true on the roaring
bitmap, because in that case only the internal pointers are cloned.

This change will reduce the TSI cache performance by around 10%, which I
have deemed to account for only a few microseconds typically.
2019-01-25 13:38:22 +00:00
Ahmed Ashour 0d1d2c841e Fix typo 2019-01-23 13:33:09 +01:00
Edd Robinson 07b8eacf34 Fix bucket delete for all buckets
If a bucket had bytes in it that would be escaped by the models
parser/package, then the index would not be correctly purged of those
series data when the bucket was dropped.
2019-01-18 17:28:58 +00:00
Edd Robinson 045bb64c5e Add skeleton TSI design doc 2019-01-17 12:22:08 +00:00
Edd Robinson 810b5d9281 Bulk log file delete
This commit adds a method to delete many series ids from the LogFile in
bulk, reducing the number of fsyncs required.
2019-01-15 11:45:12 +00:00
Edd Robinson 9ff65f6016 Track deleted series ids to remove from series file
Previously series that were being removed were tracked at the key level.
This means that when removing them from the series file, the series id
first had to be looked up. This can cause lock thrashing when there are
many series ids to look up (such as with a bulk delete), because there
are no bulk methods to do this.

This commit changes how the series file delete is done by extracting
the series ids from the index before we remove the index entries. It's
then possible to delete all those series ids from the series file
without having to lookup the ids.
2019-01-15 11:45:10 +00:00
Edd Robinson b025d9afa9 Improve efficiency of TSI index series drop
This commit improves the performance of a mass delete on the TSI index
by deleting at the measurement level instead of deleting each series
individually.
2019-01-14 12:46:55 +00:00
Edd Robinson 7ee4f499e1 Clarify best method of set difference 2019-01-14 12:46:53 +00:00
Edd Robinson c7d26d8950 Rename delete method 2019-01-14 11:23:13 +00:00
Edd Robinson 20a8528337 Ensure TSI bitset cache cleaned up on m drop 2019-01-14 11:23:13 +00:00
Mark Rushakoff d73d73c0d4 chore: rename imports from platform to influxdb
I did this with a dumb editor macro, so some comments changed too.

Also rename root package from platform to influxdb.

In interest of minimizing risk, anyone importing the root package has
now aliased it to "platform" so that no changes beyond imports were
necessary in those files.

Lastly, replace the old platform module to local path /dev/null so that
nobody can accidentally reintroduce a platform dependency while
migrating platform code to influxdb.
2019-01-09 20:51:47 -08:00
Jeff Wendling 703c3c15ca Hook up DeleteBucket to the tsm1 engine 2019-01-09 15:24:26 -07:00
Jeff Wendling b5bfb836c0 tsm1: remove unsafe in prefixTree 2019-01-09 12:43:01 -07:00
Jeff Wendling e503ef40d1 tsm1: add comments responding to review feedback 2019-01-09 11:35:06 -07:00
Jeff Wendling 73c0ea410e tsm1: add test for engine DeletePrefix 2019-01-09 10:56:10 -07:00
Jeff Wendling 0a85e3b0dd tsm1: add initial index cleanup to DeletePrefix 2019-01-08 16:32:43 -07:00
Jeff Wendling 0fe2f02812 tsm1: initial DeletePrefix impl 2019-01-08 16:03:34 -07:00
Jeff Wendling f712828016 tsm1: refactor and rename some methods 2019-01-08 14:52:30 -07:00
Jeff Wendling 8744a82665 tsm1: add DeletePrefix to the reader 2019-01-07 21:11:49 -07:00
Jeff Wendling f65b0933f6 tsm1: move code around into smaller files and add tests 2019-01-07 21:11:49 -07:00
Jeff Wendling fed3154506 tsm1: DeletePrefix on the indirectIndex 2019-01-07 21:08:32 -07:00
Jeff Wendling ad5352926f tsm1: log when error reading entries for tsm key 2019-01-07 11:00:35 -07:00
Jeff Wendling 9cdefa8e4f tsm1: fix staticcheck and refactor closure out 2019-01-07 11:00:35 -07:00
Jeff Wendling 1ffcd77342 tsm1: fix remaining issues and add small benchmarks
- notice when keys are deleted during iteration and return an error
- make sure all the consumers check the error
- add some benchmarks for small indexes to compare
- allow concurrent readers to flag deletes

benchmarks against base:

name                                           old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8                  70.0ms ±17%    71.0ms ±12%      ~     (p=1.000 n=8+8)
IndirectIndex_DeleteRangeLast-8                  1.48µs ± 1%    0.28µs ± 5%   -81.29%  (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull/Large-8             786ms ± 1%     363ms ± 3%   -53.89%  (p=0.000 n=7+8)
IndirectIndex_DeleteRangeFull/Small-8            2.37ms ± 0%    1.14ms ± 3%   -52.02%  (p=0.000 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8     384ms ± 2%     188ms ± 3%   -51.04%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8     470µs ± 1%     190µs ± 1%   -59.71%  (p=0.000 n=8+7)
IndirectIndex_Delete/Large-8                     74.0ms ± 1%   128.7ms ± 1%   +73.80%  (p=0.001 n=7+7)
IndirectIndex_Delete/Small-8                      142µs ± 1%     130µs ± 1%    -8.24%  (p=0.000 n=8+8)

name                                           old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8                  11.6MB ± 0%    11.7MB ± 0%    +0.02%  (p=0.000 n=8+7)
IndirectIndex_DeleteRangeLast-8                  3.26kB ± 0%   0.00kB ±NaN%  -100.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Large-8             233MB ± 0%     161MB ± 0%   -30.75%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Small-8            2.13MB ± 0%    1.40MB ± 0%   -34.53%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8    12.4MB ± 0%     0.4MB ± 0%   -96.82%  (p=0.002 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8     120kB ± 0%       0kB ± 0%   -99.89%  (p=0.000 n=8+8)
IndirectIndex_Delete/Large-8                     4.54kB ± 0%    0.21kB ± 0%   -95.26%  (p=0.000 n=8+8)
IndirectIndex_Delete/Small-8                      80.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=8+8)

name                                           old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8                    35.0 ± 0%      42.0 ± 0%   +20.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8                    3.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Large-8             1.53M ± 0%     0.52M ± 0%   -65.98%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull/Small-8             15.2k ± 0%      5.2k ± 0%   -65.97%  (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull_Covered/Large-8       620 ± 0%       124 ± 0%   -80.00%  (p=0.002 n=7+8)
IndirectIndex_DeleteRangeFull_Covered/Small-8      10.0 ± 0%       2.0 ± 0%   -80.00%  (p=0.000 n=8+8)
IndirectIndex_Delete/Large-8                        246 ± 0%         1 ± 0%   -99.59%  (p=0.000 n=8+8)
IndirectIndex_Delete/Small-8                       4.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=8+8)
2019-01-07 11:00:35 -07:00
Jeff Wendling 14cf01911e tsm1: change TSMFile to use an iterator style api 2019-01-07 11:00:35 -07:00
Jeff Wendling 917584b054 tsm1: use readerOffsetsIterator for deletes
This reduces the number of disk hits at some cost in CPU on some benchmarks. Notably, the
DeleteRangeFull_Covered and Delete benchmarks both went to approximately zero page faults
meaning they read from the index file linearly.

name                                     old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8            68.8ms ±10%    63.1ms ±16%   -8.28%          (p=0.021 n=8+8)
IndirectIndex_Entries-8                    9.09µs ± 3%    9.62µs ± 1%   +5.84%          (p=0.000 n=8+7)
IndirectIndex_ReadEntries-8                5.86µs ± 1%    6.15µs ± 3%   +5.03%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8             562ns ± 6%     308ns ± 2%  -45.25%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8             363ms ±10%     376ms ± 5%     ~             (p=0.054 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8     574ms ± 2%     746ms ± 0%  +30.01%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                     51.2ms ± 0%    88.2ms ± 0%  +72.38%          (p=0.000 n=8+7)

name                                     old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8            11.7MB ± 0%    11.7MB ± 0%     ~     (all samples are equal)
IndirectIndex_Entries-8                    32.8kB ± 0%    32.8kB ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8            0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8             162MB ± 0%     162MB ± 0%     ~             (p=0.798 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8    82.4MB ± 0%    82.4MB ± 0%     ~             (p=0.857 n=8+8)
IndirectIndex_Delete-8                     4.01kB ± 0%    4.04kB ± 0%   +0.90%          (p=0.000 n=8+8)

name                                     old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8              42.0 ± 0%      42.0 ± 0%     ~     (all samples are equal)
IndirectIndex_Entries-8                      1.00 ± 0%      1.00 ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                 0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8             0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8              522k ± 0%      522k ± 0%     ~             (p=0.743 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8     3.31k ± 0%     3.31k ± 0%     ~             (p=0.856 n=8+8)
IndirectIndex_Delete-8                        123 ± 0%       123 ± 0%     ~     (all samples are equal)

name                                     old speed      new speed      delta
IndirectIndex_DeleteRangeFull-8          18.1MB/s ± 9%  17.5MB/s ± 7%     ~             (p=0.105 n=8+8)
IndirectIndex_Delete-8                    116MB/s ± 0%     0MB/s ± 0%  -99.96%          (p=0.000 n=8+8)
2019-01-07 11:00:35 -07:00
Jeff Wendling 6f5c94f3f7 tsm1: introduce readerOffsets to manage the offsets slice
It exposes an API that will clean up the bodies of many methods and
provide a safe abstraction around iteration that will be able to
handle reads with concurrent deletes.

Benchmarks are flat.
2019-01-07 11:00:35 -07:00
Jeff Wendling f860305124 tsm1: keep first 8 bytes of each key in memory
Since most keys will share the first 8 bytes, we collapse them into
a slice containing partial sums of the counts. We can then binary search
into that slice to find the associated prefix for a given offset index.
Compressing in this way makes the overhead negligible and reduces
disk misses by about 30% in these benchmarks (500k series across 100 orgs).
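
A minimal sketch of the partial-sums lookup described above, not the actual tsm1 code. The type `prefixIndex` and the helpers `add`, `lookup`, and `toPrefix` are hypothetical names chosen for illustration. Each run of keys sharing an 8-byte prefix is stored once, together with a running total of key counts, and a binary search over those totals maps an offset index back to its prefix.

```go
package main

import (
	"fmt"
	"sort"
)

// prefixIndex is a hypothetical structure: prefixes[i] is shared by the
// offset indexes in the half-open range [sums[i-1], sums[i]).
type prefixIndex struct {
	prefixes [][8]byte
	sums     []int // running totals of how many keys use each prefix
}

// add records that the next count keys share prefix p.
func (x *prefixIndex) add(p [8]byte, count int) {
	total := count
	if n := len(x.sums); n > 0 {
		total += x.sums[n-1]
	}
	x.prefixes = append(x.prefixes, p)
	x.sums = append(x.sums, total)
}

// lookup returns the prefix for the key at offset index i.
func (x *prefixIndex) lookup(i int) ([8]byte, bool) {
	// Find the first running total strictly greater than i.
	j := sort.Search(len(x.sums), func(k int) bool { return x.sums[k] > i })
	if i < 0 || j == len(x.sums) {
		return [8]byte{}, false
	}
	return x.prefixes[j], true
}

// toPrefix copies up to 8 bytes of s into a fixed-size prefix.
func toPrefix(s string) [8]byte {
	var p [8]byte
	copy(p[:], s)
	return p
}

func main() {
	var x prefixIndex
	x.add(toPrefix("org00001"), 3) // offset indexes 0..2
	x.add(toPrefix("org00002"), 2) // offset indexes 3..4
	for i := 0; i < 6; i++ {
		p, ok := x.lookup(i)
		fmt.Printf("index %d: prefix %q found=%v\n", i, p[:], ok)
	}
}
```

With many series per org, the two slices stay small relative to the number of keys, which is consistent with the negligible overhead reported above.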

name                                     old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8            67.5ms ± 1%    64.6ms ± 1%   -4.33%          (p=0.000 n=8+7)
IndirectIndex_Entries-8                    9.41µs ± 2%    9.39µs ± 1%     ~             (p=0.959 n=8+8)
IndirectIndex_ReadEntries-8                5.99µs ± 1%    6.07µs ± 1%   +1.29%          (p=0.001 n=8+8)
IndirectIndex_DeleteRangeLast-8             369ns ± 2%     566ns ± 1%  +53.37%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8             368ms ± 9%     369ms ± 2%     ~             (p=0.232 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8     600ms ± 1%     618ms ± 0%   +3.03%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                     50.0ms ± 1%    47.6ms ± 9%     ~             (p=0.463 n=7+8)

name                                     old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8            11.6MB ± 0%    11.7MB ± 0%   +0.02%          (p=0.000 n=8+7)
IndirectIndex_Entries-8                    32.8kB ± 0%    32.8kB ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8            0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8             162MB ± 0%     162MB ± 0%     ~             (p=0.382 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8    82.4MB ± 0%    82.4MB ± 0%     ~             (p=0.776 n=8+8)
IndirectIndex_Delete-8                     4.01kB ± 0%    4.01kB ± 0%     ~     (all samples are equal)

name                                     old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8              35.0 ± 0%      42.0 ± 0%  +20.00%          (p=0.000 n=8+8)
IndirectIndex_Entries-8                      1.00 ± 0%      1.00 ± 0%     ~     (all samples are equal)
IndirectIndex_ReadEntries-8                 0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8             0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
IndirectIndex_DeleteRangeFull-8              522k ± 0%      522k ± 0%     ~             (p=0.382 n=8+8)
IndirectIndex_DeleteRangeFull_Covered-8     3.31k ± 0%     3.31k ± 0%     ~             (p=0.457 n=8+8)
IndirectIndex_Delete-8                        123 ± 0%       123 ± 0%     ~     (all samples are equal)

name                                     old speed      new speed      delta
IndirectIndex_DeleteRangeFull-8          24.7MB/s ±10%  17.8MB/s ± 2%  -28.18%          (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull_Covered-8  14.2MB/s ± 1%   9.6MB/s ± 0%  -32.30%          (p=0.000 n=8+7)
IndirectIndex_Delete-8                    171MB/s ± 1%   126MB/s ±10%  -26.35%          (p=0.000 n=7+8)

IndirectIndex_DeleteRangeLast went from 17 page faults, or ~180GB/sec at 369ns/op
to zero page faults. So even though it got 50% slower, it was actually I/O bound
and no longer is.
2019-01-07 11:00:35 -07:00
Jeff Wendling 0becfc6239 tsm1: add helper to track page faults in index
Since the methods inline and dead code is eliminated, it has no runtime
overhead in the benchmarks when disabled.
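
One way such a zero-overhead helper could look, as a sketch under assumptions: `faultTrackingEnabled`, `faultTracker`, and `record` are hypothetical names, and how the real helper decides that an access is a page fault is not shown here. The point is only that a guard on a constant lets the compiler inline the method and drop the dead branch.

```go
package main

import "fmt"

// faultTrackingEnabled is a hypothetical compile-time switch. Because it is
// a constant, the compiler can remove the guarded branch when it is false.
const faultTrackingEnabled = false

// faultTracker counts accesses that the caller considers page faults.
type faultTracker struct {
	faults int
}

// record is small enough to be inlined. With the constant set to false its
// body is dead code, so calls to it cost nothing at runtime.
func (t *faultTracker) record() {
	if faultTrackingEnabled {
		t.faults++
	}
}

func main() {
	var t faultTracker
	for i := 0; i < 10; i++ {
		t.record()
	}
	fmt.Println("recorded faults:", t.faults)
}
```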

benchmark                                  recorded faults
BenchmarkIndirectIndex_Entries-8           11
BenchmarkIndirectIndex_ReadEntries-8       11
BenchmarkIndirectIndex_DeleteRangeLast-8   17
BenchmarkIndirectIndex_DeleteRangeFull-8   2218
BenchmarkIndirectIndex_Delete-8            2084
2019-01-07 11:00:35 -07:00
Jeff Wendling 91e820a9d8 tsm1: fix multiple issues with DeleteRange
1. Correctly acquires locks
2. Seeks for discontiguous key ranges (like delete ["aaa", "zzz"])
3. Is precise about deleting a key when it contains no data

name                             old time/op    new time/op    delta
IndirectIndex_UnmarshalBinary-8    67.3ms ± 1%    63.2ms ±15%      ~             (p=0.463 n=7+8)
IndirectIndex_Entries-8            9.14µs ± 1%    9.01µs ± 0%    -1.40%          (p=0.004 n=8+7)
IndirectIndex_ReadEntries-8        5.83µs ± 1%    5.68µs ± 2%    -2.62%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeLast-8     283ns ± 2%     191ns ± 1%   -32.37%          (p=0.000 n=8+7)
IndirectIndex_DeleteRangeFull-8     612ms ± 1%     361ms ± 1%   -41.02%          (p=0.000 n=8+8)
IndirectIndex_Delete-8             49.0ms ± 1%    49.8ms ± 1%    +1.80%          (p=0.001 n=7+8)

name                             old alloc/op   new alloc/op   delta
IndirectIndex_UnmarshalBinary-8    11.6MB ± 0%    11.6MB ± 0%      ~     (all samples are equal)
IndirectIndex_Entries-8            32.8kB ± 0%    32.8kB ± 0%      ~     (all samples are equal)
IndirectIndex_ReadEntries-8        0.00B ±NaN%    0.00B ±NaN%      ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8     64.0B ± 0%     0.0B ±NaN%  -100.00%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8     168MB ± 0%     162MB ± 0%    -3.71%          (p=0.000 n=8+8)
IndirectIndex_Delete-8             3.94kB ± 0%    3.94kB ± 0%      ~     (all samples are equal)

name                             old allocs/op  new allocs/op  delta
IndirectIndex_UnmarshalBinary-8      35.0 ± 0%      35.0 ± 0%      ~     (all samples are equal)
IndirectIndex_Entries-8              1.00 ± 0%      1.00 ± 0%      ~     (all samples are equal)
IndirectIndex_ReadEntries-8         0.00 ±NaN%     0.00 ±NaN%      ~     (all samples are equal)
IndirectIndex_DeleteRangeLast-8      2.00 ± 0%     0.00 ±NaN%  -100.00%          (p=0.000 n=8+8)
IndirectIndex_DeleteRangeFull-8     1.04M ± 0%     0.52M ± 0%   -49.77%          (p=0.000 n=8+8)
IndirectIndex_Delete-8                123 ± 0%       123 ± 0%      ~     (all samples are equal)
2019-01-07 11:00:35 -07:00
Jeff Wendling aed17cfedd tsm1: speed up IndirectIndex benchmarks
Rather than create the indirectIndex on every benchmark iteration,
reuse a global one. Reduces a couple of benchmarks where Start/Stop timer
calls weren't in place.

benchmark                                    old ns/op     new ns/op     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     67710057      69216355      +2.22%
BenchmarkIndirectIndex_Entries-8             9239          9762          +5.66%
BenchmarkIndirectIndex_ReadEntries-8         5964          5886          -1.31%
BenchmarkIndirectIndex_DeleteRangeLast-8     317           284           -10.41%
BenchmarkIndirectIndex_DeleteRangeFull-8     615346992     598392398     -2.76%
BenchmarkIndirectIndex_Delete-8              52906315      44400269      -16.08%

benchmark                                    old allocs     new allocs     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     35             35             +0.00%
BenchmarkIndirectIndex_Entries-8             1              1              +0.00%
BenchmarkIndirectIndex_ReadEntries-8         0              0              +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     2              2              +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     1038932        1038722        -0.02%
BenchmarkIndirectIndex_Delete-8              123            123            +0.00%

benchmark                                    old bytes     new bytes     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     11648760      11648760      +0.00%
BenchmarkIndirectIndex_Entries-8             32768         32768         +0.00%
BenchmarkIndirectIndex_ReadEntries-8         1             0             -100.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     64            64            +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     168112352     168061952     -0.03%
BenchmarkIndirectIndex_Delete-8              3936          3936          +0.00%
2019-01-07 11:00:35 -07:00
Jeff Wendling d40c3e662f tsm1: use uint32 key for tombstones
rough, noisy benchmarks.

benchmark                                    old ns/op     new ns/op     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     62462250      67710057      +8.40%
BenchmarkIndirectIndex_Entries-8             9601          9239          -3.77%
BenchmarkIndirectIndex_ReadEntries-8         5984          5964          -0.33%
BenchmarkIndirectIndex_DeleteRangeLast-8     314           317           +0.96%
BenchmarkIndirectIndex_DeleteRangeFull-8     813838165     615346992     -24.39%
BenchmarkIndirectIndex_Delete-8              52079181      52906315      +1.59%

benchmark                                    old allocs     new allocs     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     35             35             +0.00%
BenchmarkIndirectIndex_Entries-8             1              1              +0.00%
BenchmarkIndirectIndex_ReadEntries-8         0              0              +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     2              2              +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     1532670        1038932        -32.21%
BenchmarkIndirectIndex_Delete-8              123            123            +0.00%

benchmark                                    old bytes     new bytes     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     11648760      11648760      +0.00%
BenchmarkIndirectIndex_Entries-8             32768         32768         +0.00%
BenchmarkIndirectIndex_ReadEntries-8         1             1             +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     64            64            +0.00%
BenchmarkIndirectIndex_DeleteRangeFull-8     232738960     168112352     -27.77%
BenchmarkIndirectIndex_Delete-8              3936          3936          +0.00%
2019-01-07 11:00:35 -07:00
Jeff Wendling ffd35ce1aa tsm1: use a uint32 for offsets globally
benchmarks are flat.
2019-01-07 11:00:35 -07:00
Jeff Wendling 7a7a4b6d58 tsm1: remove offsets from mmap
benchmark                                    old ns/op     new ns/op     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     74525387      66439305      -10.85%
BenchmarkIndirectIndex_Entries-8             8892          9200          +3.46%
BenchmarkIndirectIndex_ReadEntries-8         5816          5691          -2.15%
BenchmarkIndirectIndex_DeleteRangeLast-8     1550          311           -79.94%
BenchmarkIndirectIndex_DeleteRangeFull-8     773649708     767030277     -0.86%
BenchmarkIndirectIndex_Delete-8              79755991      52015903      -34.78%

benchmark                                    old allocs     new allocs     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     35             35             +0.00%
BenchmarkIndirectIndex_Entries-8             1              1              +0.00%
BenchmarkIndirectIndex_ReadEntries-8         0              0              +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     3              2              -33.33%
BenchmarkIndirectIndex_DeleteRangeFull-8     1532589        1532344        -0.02%
BenchmarkIndirectIndex_Delete-8              246            123            -50.00%

benchmark                                    old bytes     new bytes     delta
BenchmarkIndirectIndex_UnmarshalBinary-8     11648760      11648760      +0.00%
BenchmarkIndirectIndex_Entries-8             32768         32768         +0.00%
BenchmarkIndirectIndex_ReadEntries-8         1             1             +0.00%
BenchmarkIndirectIndex_DeleteRangeLast-8     3264          64            -98.04%
BenchmarkIndirectIndex_DeleteRangeFull-8     232710448     232624208     -0.04%
BenchmarkIndirectIndex_Delete-8              4432          3936          -11.19%
2019-01-07 11:00:35 -07:00
Edd Robinson 9d8114ef88 Fix tombstone error logic 2018-12-18 13:00:13 +00:00
Edd Robinson 419db63f7a Address PR feedback 2018-12-18 12:33:28 +00:00
Edd Robinson 262772544a Add prefix key to tombstoner
This commit adds support for "prefix keys". Prefix keys differ from
regular tombstone key entries in that the key of the entry should act as
a prefix that matches all series with the same prefix key for the given
time range.

This means only one entry is needed to delete many series.

The tombstone entries now have a maximum length of 16777215 (24 bits),
with the remaining 8 high bits available for setting further options /
meta information about the tombstone entry.

In this case, the top bit is used to indicate that the tombstone entry
is intended to be a prefix. This leaves 7 spare bits for future use.
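
A sketch of the bit layout described above, assuming the length and option bits share a single 32-bit word. The names `encodeHeader`, `decodeHeader`, `maxKeyLen`, and `prefixFlag` are hypothetical, and the real on-disk encoding may differ.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxKeyLen  = 1<<24 - 1 // 16777215, the largest length that fits in 24 bits
	prefixFlag = 1 << 31   // top bit marks the entry as a prefix key
)

// encodeHeader packs the key length into the low 24 bits and sets the top
// bit when the entry is a prefix key. The other 7 high bits stay zero.
func encodeHeader(keyLen int, isPrefix bool) (uint32, error) {
	if keyLen < 0 || keyLen > maxKeyLen {
		return 0, errors.New("key length does not fit in 24 bits")
	}
	h := uint32(keyLen)
	if isPrefix {
		h |= prefixFlag
	}
	return h, nil
}

// decodeHeader reverses encodeHeader.
func decodeHeader(h uint32) (keyLen int, isPrefix bool) {
	return int(h & maxKeyLen), h&prefixFlag != 0
}

func main() {
	h, err := encodeHeader(len("cpu,host="), true)
	if err != nil {
		panic(err)
	}
	n, prefix := decodeHeader(h)
	fmt.Printf("header=%#08x len=%d prefix=%v\n", h, n, prefix)
}
```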
2018-12-18 12:33:28 +00:00
Edd Robinson 1b5ad5e129 Ensure v4 tombstone file read correctly 2018-12-18 12:33:28 +00:00
Edd Robinson 90f3583fe0 Remove old tombstone version support 2018-12-18 12:33:28 +00:00
Jeff Wendling 04605eb266 tsm1: speed up deleterange for large keys
Rather than starting at the first key in the index, do a binary search to
the first key to delete. Changes deleting the largest key from O(N) to O(log N).
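
The idea can be sketched with the standard library, assuming the keys are held in a sorted slice of strings (the real index stores keys differently). `firstAtOrAfter` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// firstAtOrAfter returns the index of the first key >= target in a sorted
// slice, or len(keys) if there is none. It runs in O(log N) rather than
// scanning from index 0.
func firstAtOrAfter(keys []string, target string) int {
	return sort.SearchStrings(keys, target)
}

func main() {
	keys := []string{"aaa", "bbb", "ccc", "zzz"}
	start := firstAtOrAfter(keys, "zzz")
	fmt.Println("start deleting at index", start, "key", keys[start])
}
```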

benchmark                                    old ns/op       new ns/op     delta
BenchmarkIndirectIndex_DeleteRangeFull-8     17884166763     738717473     -95.87%
2018-12-14 10:06:24 -07:00
Jeff Wendling 687a390aaf tsm1: add benchmarks for deletes 2018-12-14 10:06:24 -07:00
Mark Rushakoff f383e8337a test(tsdb/tsi1): test series id cache delete concurrently
Report the total number of gets, puts, and deletes at the end of the
test. I've found this kind of output to be a useful sanity check in
similar tests that exercise concurrency involving tasks.

Use a local random source in each goroutine. I unscientifically
eyeballed that to increase total operations by 5-10%.

Also call t.Parallel in a few more tests that involve disk access. This
shaves 1-2 seconds off the full tsi1 test suite on my machine.
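
A sketch of the two test techniques mentioned above: a local random source per goroutine and a final tally of operations. It is not the actual test, and `opCounts` and `runWorkers` are hypothetical names. Each goroutine owns its own `*rand.Rand`, so the workers do not contend on the lock that guards the package-level generator.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// opCounts tallies the operations performed by the workers.
type opCounts struct{ gets, puts, deletes int }

// runWorkers starts n goroutines that each perform ops random operations
// and returns the combined totals.
func runWorkers(n, ops int) opCounts {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total opCounts
	)
	for w := 0; w < n; w++ {
		wg.Add(1)
		go func(seed int64) {
			defer wg.Done()
			// Local source: no state shared with other goroutines.
			rng := rand.New(rand.NewSource(seed))
			var local opCounts
			for i := 0; i < ops; i++ {
				switch rng.Intn(3) {
				case 0:
					local.gets++
				case 1:
					local.puts++
				default:
					local.deletes++
				}
			}
			// Merge the local tally once, under the mutex.
			mu.Lock()
			total.gets += local.gets
			total.puts += local.puts
			total.deletes += local.deletes
			mu.Unlock()
		}(int64(w) + 1)
	}
	wg.Wait()
	return total
}

func main() {
	c := runWorkers(4, 1000)
	fmt.Printf("gets=%d puts=%d deletes=%d\n", c.gets, c.puts, c.deletes)
}
```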
2018-12-12 08:21:17 -08:00
Edd Robinson eba485f2be Fix nil tracker for full compactions 2018-12-11 18:30:59 +00:00
Edd Robinson c4b42c72be Add option to disable TSI metrics 2018-12-10 15:02:26 +00:00
Edd Robinson 6b63a3def7 Add option to disable sfile metrics 2018-12-10 14:36:28 +00:00
Edd Robinson 3ff39cd9dc
Merge pull request #1789 from influxdata/er-cache-size
Allow TSI cache to be dynamically altered
2018-12-10 11:23:15 +00:00
Ben Johnson df0b084543
Merge pull request #1785 from influxdata/bj-tss-file-observer
Allow stats files to be observed for finishing/unlinking.
2018-12-07 18:46:35 -07:00
Edd Robinson e0cddadffd Allow TSI cache to be dynamically altered 2018-12-07 18:35:25 +00:00
Ben Johnson 73d8c85aa2
Allow stats files to be observed for finishing/unlinking.
This commit adds the `.tss` files generated for TSM statistics to
the `FileObserver` so that package users can be notified when new
stats files are created and removed.
2018-12-07 10:20:32 -07:00
Edd Robinson e13309ebbe Fix metric names 2018-12-07 16:37:17 +00:00
Edd Robinson b015757c06 Ensure all tsi1 metrics support multiple instances 2018-12-07 14:32:34 +00:00
Edd Robinson bff655786f Ensure tsdb metrics properly registered 2018-12-07 14:32:34 +00:00
Edd Robinson aa936df138 Ensure all tsm1 metrics support multiple instances 2018-12-07 14:32:34 +00:00
Edd Robinson d94f898c8b WIP 2018-12-07 14:32:34 +00:00
Edd Robinson 79b108d174 Fix bug with slice reuse 2018-12-07 14:32:34 +00:00
Edd Robinson de491968ba Fix rebase 2018-12-07 14:32:34 +00:00
Edd Robinson 93892c20ab Fix test 2018-12-07 14:32:34 +00:00
Edd Robinson a8b6827c9e megacheck 2018-12-07 14:32:34 +00:00
Edd Robinson 2bb558a9d1 Ensure fileset protected by lock 2018-12-07 14:32:34 +00:00
Edd Robinson a1804d27be Fix race 2018-12-07 14:32:34 +00:00
Edd Robinson f9a2f7a017 go fmt 2018-12-07 14:32:34 +00:00
Edd Robinson e0c10227d0 Fix metric issue in series file 2018-12-07 14:32:34 +00:00
Edd Robinson 7960ccc320 Add TSI index metrics 2018-12-07 14:32:34 +00:00
Edd Robinson 55caa0fe54 Add RHH metrics 2018-12-07 14:32:34 +00:00
Edd Robinson d1fe2bc188 Add series file metrics 2018-12-07 14:32:34 +00:00
Edd Robinson 8ca637bd80 Refactor default labels and retention metrics 2018-12-07 14:32:34 +00:00
Edd Robinson 6c5dec8f88 Refactor tracker names 2018-12-07 14:32:34 +00:00
Edd Robinson 44e5fbae0a Convert WAL stats 2018-12-07 14:32:34 +00:00
Edd Robinson 3b980ed7e3 Convert Cache statistics 2018-12-07 14:32:34 +00:00
Edd Robinson d61b9f1645 Convert Filestore stats 2018-12-07 14:32:34 +00:00
Edd Robinson f56bc0853f Convert TSM compaction stats to Prom metrics
This commit converts all the 1.x TSM compaction statistics, which
previously were written to an _internal db, to Prometheus metrics.
2018-12-07 14:32:34 +00:00
Edd Robinson 186e0392ed Address PR feedback 2018-11-30 10:54:24 +00:00
Edd Robinson e11789f46a Omit unused receiver name: ST1006 2018-11-30 10:54:24 +00:00
Edd Robinson eaa4a4f49a Removes unused methods: U1000 2018-11-30 10:54:24 +00:00
Edd Robinson 9403c1ec8e Ensure error strings not capitalised ST1005 2018-11-30 10:54:24 +00:00
Edd Robinson 308a5148cf Remove iterators 2018-11-30 10:54:24 +00:00
Ben Johnson 98d24f7e3c
Merge pull request #1625 from influxdata/remove-influxdb-dependency
Remove influxdb dependency.
2018-11-29 14:23:00 -07:00
Ben Johnson 0084d4d824
Remove influxdb dependency. 2018-11-29 11:44:22 -07:00
Edd Robinson 7ccb201b80
Merge pull request #1332 from zhulongcheng/rm-create-series
Remove Index.CreateSeriesIfNotExists
2018-11-29 18:36:48 +00:00
Ben Johnson 1862b4421d
Integrate scanned values statistics tracking. 2018-11-28 15:32:06 -07:00
Ben Johnson e22aff46cb
Force create TSS files.
This commit replaces an `os.OpenFile()` call with an `os.Create()`
call which drops `O_EXCL` for `O_TRUNC` since `.tss` files are only
created after `.tsm` files so lingering temporary files are safe to
overwrite.
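
The behavioural difference can be sketched as follows. The file name and the helper `openBoth` are hypothetical, and the flags other than `O_EXCL` in the original `os.OpenFile()` call are assumed. `os.Create` opens with `O_RDWR|O_CREATE|O_TRUNC`, so a lingering file is truncated instead of causing an error.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// openBoth leaves a stale file behind, then tries both ways of creating it
// again. It reports whether the O_EXCL open was refused because the file
// exists and whether os.Create succeeded.
func openBoth() (exclRefused, createOK bool, err error) {
	dir, err := os.MkdirTemp("", "tss-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "000001.tss.tmp") // hypothetical file name
	if err := os.WriteFile(path, []byte("stale"), 0o600); err != nil {
		return false, false, err
	}

	// O_EXCL refuses to open a file that already exists.
	ef, exclErr := os.OpenFile(path, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0o600)
	if exclErr == nil {
		ef.Close()
	}
	exclRefused = errors.Is(exclErr, fs.ErrExist)

	// os.Create truncates the existing file instead.
	f, createErr := os.Create(path)
	if createErr == nil {
		createOK = true
		f.Close()
	}
	return exclRefused, createOK, nil
}

func main() {
	refused, ok, err := openBoth()
	if err != nil {
		panic(err)
	}
	fmt.Println("O_EXCL refused existing file:", refused)
	fmt.Println("os.Create overwrote it:", ok)
}
```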
2018-11-27 08:07:21 -07:00
zhulongcheng ed799a3d6c remove Index.CreateSeriesIfNotExists 2018-11-21 20:16:45 +08:00
zhulongcheng dbfa140cc4 remove Engine.CreateSeriesIfNotExists 2018-11-21 20:16:45 +08:00
zhulongcheng 085ce852b7 remove CreateSeriesIfNotExists from engine tests 2018-11-21 20:16:45 +08:00
Mark Rushakoff 8ab01c99c0 test(tsdb/tsm1): skip long tests in short mode
The tsdb/tsm1 package was one of the test suites that took the longest
to run in platform with go test -short. The rule of thumb on the Go
project is that short mode should skip any individual test that takes
longer than one second. This change skips two such tests, and it
eliminates a string concatenation loop in two other tests, so that they
report completion in "0.00s" rather than about 0.94s, on my machine.

These cumulative changes take `go test -short ./tsdb/tsm1` from about 14
seconds to about 7 seconds on my machine.
2018-11-16 08:06:23 -08:00
Christopher M. Wolff bbd460e7d9
Add method QueryRawJSON to influxql.service (for querytest tool) (#1402) 2018-11-15 10:45:38 -08:00
zhulongcheng e7bc29a590 reduce parsing and copying of tags 2018-11-15 20:45:16 +08:00
Stuart Carnie 305ebb8729 fix: Allow compactor to make progress if v.MaxTime() != entry.MaxTime 2018-11-14 12:14:45 +00:00
Stuart Carnie b35533e7f7 chore: Compactor test which replicates issue #10465
Due to an encoding bug with simple8b, it is possible that the
MaxTime for a TSM index entry does not match the last encoded timestamp.
2018-11-14 12:14:43 +00:00
Jeff Wendling f731ed595d
Merge pull request #1358 from influxdata/jmw-test-explode-points
test(tsdb): add test for explode points
2018-11-13 14:41:29 -07:00
Mark Rushakoff 1ab9c80ae8 fix(tsdb): eliminate data race from *SeriesIDSet.Clone
And add a test to cover that.

The data race would look roughly like:

```
WARNING: DATA RACE
Write at 0x00c000024e18 by goroutine 8:
  github.com/RoaringBitmap/roaring.(*roaringArray).markAllAsNeedingCopyOnWrite()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaringarray.go:881 +0x6b
  github.com/RoaringBitmap/roaring.(*roaringArray).clone()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaringarray.go:266 +0x808
  github.com/RoaringBitmap/roaring.(*Bitmap).Clone()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaring.go:385 +0x58
  github.com/influxdata/platform/tsdb.(*SeriesIDSet).CloneNoLock()
      /Users/mr/go/src/github.com/influxdata/platform/tsdb/series_set.go:229 +0x73
  github.com/influxdata/platform/tsdb.(*SeriesIDSet).Clone()

Previous write at 0x00c000024e18 by goroutine 7:
  github.com/RoaringBitmap/roaring.(*roaringArray).markAllAsNeedingCopyOnWrite()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaringarray.go:881 +0x6b
  github.com/RoaringBitmap/roaring.(*roaringArray).clone()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaringarray.go:266 +0x808
  github.com/RoaringBitmap/roaring.(*Bitmap).Clone()
      /Users/mr/go/pkg/mod/github.com/!roaring!bitmap/roaring@v0.4.16/roaring.go:385 +0x58
  github.com/influxdata/platform/tsdb.(*SeriesIDSet).CloneNoLock()
      /Users/mr/go/src/github.com/influxdata/platform/tsdb/series_set.go:229 +0x73
  github.com/influxdata/platform/tsdb.(*SeriesIDSet).Clone()
      /Users/mr/go/src/github.com/influxdata/platform/tsdb/series_set.go:223 +0x7b
```
2018-11-13 08:12:38 -08:00
Jeff Wendling 704941d624 test(tsdb): add test for explode points 2018-11-12 17:36:33 -07:00
Jeff Wendling 39f4908946 fix(storage): allow disabling the WAL
We were passing a non-nil tsm1.Log containing a nil *tsm1.WAL, which
would cause a panic when it was used. Instead, always
pass a non-nil WAL.

We change the storage engine code to not pass in a nil WAL, and
additionally add a defensive check to change any nil WALs into a
NopWAL.
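
The underlying Go behaviour can be sketched like this. The `Log`, `WAL`, and `NopWAL` types below are simplified stand-ins with a made-up `Write` method, not the tsm1 definitions, and `orNop` is a hypothetical helper. An interface value holding a nil pointer is itself not nil, so a plain `== nil` check is not enough.

```go
package main

import "fmt"

// Log is a stand-in interface for illustration only.
type Log interface {
	Write(p []byte) error
}

// WAL is a stand-in type whose method dereferences its receiver.
type WAL struct{ entries int }

// Write panics if w is nil, because it touches w.entries.
func (w *WAL) Write(p []byte) error {
	w.entries++
	return nil
}

// NopWAL discards everything and is always safe to call.
type NopWAL struct{}

func (NopWAL) Write([]byte) error { return nil }

// orNop is a defensive check: it swaps a nil interface, or an interface
// holding a nil *WAL, for a NopWAL.
func orNop(l Log) Log {
	if l == nil {
		return NopWAL{}
	}
	if w, ok := l.(*WAL); ok && w == nil {
		return NopWAL{}
	}
	return l
}

func main() {
	var w *WAL    // nil pointer
	var l Log = w // non-nil interface wrapping the nil pointer
	fmt.Println("interface is nil:", l == nil)

	safe := orNop(l)
	fmt.Printf("after orNop: %T, write error: %v\n", safe, safe.Write([]byte("x")))
}
```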
2018-11-09 10:45:24 -07:00
Jeff Wendling 25532778df fix(tsm1): fix max concurrent compaction logic 2018-11-09 10:14:32 -07:00
Jeff Wendling 4b504b84df respond to review feedback
- Add some documentation.
- Move compaction planner to an option instead of config.

The latter fits with the general theme of having config be things
that can be specified in a toml, and everything else being an
option.
2018-11-08 11:39:36 -07:00
Jeff Wendling a1b5b322bb some more refactoring
- add helpers to get directories out
- change FileStoreObserver to be an option rather than config.
2018-11-08 11:39:36 -07:00
Jeff Wendling 22e23d6e31 final touches
- move default directories to the storage package
- make the directory layout match before
- clean up some dead missed functions
2018-11-08 11:39:36 -07:00
Jeff Wendling 2cbc2ee896 refactor wal out, paths, and options 2018-11-08 11:39:36 -07:00
Jessica Obermark 932b0bf01a compat: Package to convert old to new config 2018-11-08 11:39:36 -07:00
Jeff Wendling 0d411023f2 config: clean up
- Breaks the weird cycle that existed with the EngineOptions
- Removes a bunch of useless parameters
- Moves around a bunch of defaults
2018-11-08 11:39:36 -07:00
zhulongcheng 594664a876 remove RegisteredIndexes tests 2018-11-05 00:00:02 +08:00
zhulongcheng aeefeb2eed remove RegisteredIndexes method 2018-11-04 21:56:57 +08:00
Edd Robinson 1857bf1084 Fix TSM index maxtime modification 2018-11-02 18:39:30 -06:00
Mark Rushakoff 985c260af7 chore(storage,tsdb): fix megacheck errors 2018-11-01 12:54:46 -07:00
Stuart Carnie a0300064df feat(tsm1): Improve performance of Gorilla float block decoding
```
name                        old time/op   new time/op    delta
FloatArrayDecodeAll/1-8      45.9ns ± 1%    13.8ns ± 1%   -70.00%  (p=0.000 n=9+9)
FloatArrayDecodeAll/55-8      686ns ± 0%     232ns ± 1%   -66.10%  (p=0.000 n=9+8)
FloatArrayDecodeAll/550-8    5.78µs ± 0%    2.22µs ± 1%   -61.61%  (p=0.000 n=9+9)
FloatArrayDecodeAll/1000-8   10.2µs ± 2%     4.0µs ± 5%   -60.47%  (p=0.000 n=10+10)

name                        old speed     new speed      delta
FloatArrayDecodeAll/1-8     414MB/s ± 1%  1383MB/s ± 1%  +233.76%  (p=0.000 n=9+9)
FloatArrayDecodeAll/55-8    144MB/s ± 0%   424MB/s ± 1%  +194.19%  (p=0.000 n=9+9)
FloatArrayDecodeAll/550-8   133MB/s ± 0%   346MB/s ± 1%  +160.09%  (p=0.000 n=9+10)
FloatArrayDecodeAll/1000-8  135MB/s ± 2%   340MB/s ± 5%  +153.03%  (p=0.000 n=10+10)
```
2018-11-01 18:59:20 +00:00
Edd Robinson 353df7edca Fix imports 2018-11-01 18:59:20 +00:00
Edd Robinson e282d012c8 Address PR feedback 2018-11-01 18:59:20 +00:00
Stuart Carnie c21336af0a fix(encoding): Improve array string encoding perf a little more
Encode the compressed data at the start of the internal buffer. This ensures
the returned slice maintains the entire capacity and is available for
subsequent use.

When we pool / reuse string buffers, this will help considerably.
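
Why the position matters can be sketched with plain slices. `encodeAtStart` and `encodeAtEnd` are hypothetical helpers that copy a payload instead of compressing it. A result sliced from the front of the buffer keeps the buffer's full capacity, while one sliced from the tail only keeps what is left after its offset.

```go
package main

import "fmt"

// encodeAtStart writes payload at the front of buf. The returned slice
// keeps cap(buf), so a caller can reuse the whole allocation later.
func encodeAtStart(buf, payload []byte) []byte {
	n := copy(buf, payload)
	return buf[:n]
}

// encodeAtEnd writes payload at the tail of buf. The returned slice starts
// at an offset, so its capacity shrinks to the bytes after that offset.
// It assumes len(buf) >= len(payload).
func encodeAtEnd(buf, payload []byte) []byte {
	off := len(buf) - len(payload)
	copy(buf[off:], payload)
	return buf[off:]
}

func main() {
	payload := []byte("0123456789")

	front := encodeAtStart(make([]byte, 64), payload)
	back := encodeAtEnd(make([]byte, 64), payload)

	fmt.Println("front: len", len(front), "cap", cap(front))
	fmt.Println("back:  len", len(back), "cap", cap(back))
}
```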

Improvements over previous commit:

```
name                        old time/op    new time/op    delta
EncodeStrings/10/batch-8       542ns ± 1%     355ns ± 2%   -34.53%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     5.29µs ± 1%    3.58µs ± 2%   -32.20%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    48.6µs ± 0%    36.2µs ± 2%   -25.40%  (p=0.008 n=5+5)

name                        old alloc/op   new alloc/op   delta
EncodeStrings/10/batch-8        704B ± 0%        0B       -100.00%  (p=0.008 n=5+5)
EncodeStrings/100/batch-8     9.47kB ± 0%    0.00kB       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8    90.1kB ± 0%     0.0kB       -100.00%  (p=0.008 n=5+5)

name                        old allocs/op  new allocs/op  delta
EncodeStrings/10/batch-8        0.00           0.00           ~     (all equal)
EncodeStrings/100/batch-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
EncodeStrings/1000/batch-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
```
2018-11-01 18:59:20 +00:00
Stuart Carnie 296d39059a fix(encoding): Improve simple8b another 6%; fix inconsequential bug
simple8b encodes deltas[1:], thus deltas[0] >= simple8b.MaxValue is
invalid.

Also changed loop calculating deltas, RLE and max to be similar to
batch timestamp, for greater consistency.

Improvements over previous commit:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    1.50µs ± 1%    1.48µs ± 1%  -1.40%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    6.10µs ± 0%    5.69µs ± 2%  -6.58%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    1.50µs ± 1%    1.49µs ± 0%  -1.21%  (p=0.008 n=5+5)
```

Improvements overall:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.48µs ± 1%  -27.25%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    5.69µs ± 2%  -35.29%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.49µs ± 0%  -26.93%  (p=0.008 n=5+5)
```
2018-11-01 18:59:20 +00:00
Stuart Carnie 9fa01f7115 feat(encoding): Improve timestamp encoding
Timestamp improvements prior to any improvements to simple8b

```
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    1.36µs ± 1%  -48.25%  (p=0.008 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    32.2µs ± 1%  -49.64%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    1.30µs ± 1%  -86.06%  (p=0.008 n=5+5)
```
2018-11-01 18:59:20 +00:00
Stuart Carnie a339f8f620 feat(encoding): Improve integer and simple8b encoding performance
simple8b EncodeAll improvements:

```
name                     old time/op  new time/op  delta
EncodeAll/1_bit-8        28.5µs ± 1%  28.6µs ± 1%     ~     (p=0.133 n=9+10)
EncodeAll/2_bits-8       28.9µs ± 2%  28.7µs ± 0%     ~     (p=0.068 n=10+8)
EncodeAll/3_bits-8       29.3µs ± 1%  28.8µs ± 0%   -1.70%  (p=0.000 n=10+10)
EncodeAll/4_bits-8       29.6µs ± 1%  29.1µs ± 1%   -1.85%  (p=0.000 n=10+10)
EncodeAll/5_bits-8       30.6µs ± 1%  29.8µs ± 2%   -2.70%  (p=0.000 n=10+10)
EncodeAll/6_bits-8       31.3µs ± 1%  30.0µs ± 1%   -4.08%  (p=0.000 n=9+9)
EncodeAll/7_bits-8       32.6µs ± 1%  30.8µs ± 0%   -5.49%  (p=0.000 n=9+9)
EncodeAll/8_bits-8       33.6µs ± 2%  31.0µs ± 1%   -7.77%  (p=0.000 n=10+9)
EncodeAll/10_bits-8      34.9µs ± 0%  31.9µs ± 2%   -8.55%  (p=0.000 n=9+10)
EncodeAll/12_bits-8      36.8µs ± 1%  32.6µs ± 1%  -11.35%  (p=0.000 n=9+10)
EncodeAll/15_bits-8      39.8µs ± 1%  34.1µs ± 2%  -14.40%  (p=0.000 n=10+10)
EncodeAll/20_bits-8      45.2µs ± 3%  36.2µs ± 1%  -19.97%  (p=0.000 n=10+9)
EncodeAll/30_bits-8      55.0µs ± 0%  40.9µs ± 1%  -25.62%  (p=0.000 n=9+9)
EncodeAll/60_bits-8      86.2µs ± 1%  55.2µs ± 1%  -35.92%  (p=0.000 n=10+10)
EncodeAll/combination-8   582µs ± 2%   502µs ± 1%  -13.80%  (p=0.000 n=9+9)
```

EncodeIntegers:

```
name                             old time/op    new time/op    delta
EncodeIntegers/1000_seq/batch-8    2.04µs ± 0%    1.50µs ± 1%  -26.22%  (p=0.008 n=5+5)
EncodeIntegers/1000_ran/batch-8    8.80µs ± 2%    6.10µs ± 0%  -30.73%  (p=0.008 n=5+5)
EncodeIntegers/1000_dup/batch-8    2.03µs ± 1%    1.50µs ± 1%  -26.04%  (p=0.008 n=5+5)
```

EncodeTimestamps (ran is improved due to simple8b improvements)

```
name                               old time/op    new time/op    delta
EncodeTimestamps/1000_seq/batch-8    2.64µs ± 1%    2.65µs ± 2%     ~     (p=0.310 n=5+5)
EncodeTimestamps/1000_ran/batch-8    64.0µs ± 1%    33.8µs ± 1%  -47.23%  (p=0.008 n=5+5)
EncodeTimestamps/1000_dup/batch-8    9.32µs ± 0%    9.28µs ± 1%     ~     (p=0.087 n=5+5)
```
2018-11-01 18:59:20 +00:00
Edd Robinson 5e7b2cb273 Fix index bug in float encoder 2018-11-01 18:59:20 +00:00
Edd Robinson 80c953b774 Add TSM batch key iterator
The batch focussed TSM key iterator iterates TSM blocks, decoding and
merging blocks where appropriate using the batch focussed
approaches.
2018-11-01 18:59:20 +00:00
Edd Robinson 5074b834cd Add batch block encoders 2018-11-01 18:59:19 +00:00
Edd Robinson ab68204683 Batch oriented unsigned encoder 2018-11-01 18:59:19 +00:00
Edd Robinson aeeef803c0 Batch oriented boolean encoders
This commit adds a tsm1 function for encoding a batch of booleans into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated sets of booleans.
2018-11-01 18:59:19 +00:00
Jeff Wendling 5376530392 Improvements to batch float encoder
- Inlined the closure to avoid a function call.
- Changed append(b, make([]byte, 8)...) to inline the make call.
- Check for NaN once at the end assuming NaN is infrequent.

New performance delta comparing the current iterators to the new batch
function:

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.32µs ± 2%    0.17µs ± 2%  -87.39%  (p=0.000 n=10+10)
EncodeFloats/10_ran      2.09µs ± 1%    0.15µs ± 0%  -92.97%  (p=0.000 n=10+9)
EncodeFloats/100_seq     8.37µs ± 2%    1.28µs ± 2%  -84.74%  (p=0.000 n=10+10)
EncodeFloats/100_ran     19.1µs ± 1%     1.3µs ± 1%  -93.08%  (p=0.000 n=9+9)
EncodeFloats/1000_seq    60.4µs ± 1%    12.6µs ± 0%  -79.13%  (p=0.000 n=9+7)
EncodeFloats/1000_ran     212µs ± 1%      12µs ± 1%  -94.53%  (p=0.000 n=9+8)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-11-01 18:59:19 +00:00
Edd Robinson d8b5f9d432 Batch oriented string encoders
This commit adds a tsm1 function for encoding a batch of strings into a
provided buffer. The new function also shares the buffer between the
input data and the snappy encoded output, reducing allocations.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders using
randomly generated strings.

name                old time/op    new time/op    delta
EncodeStrings/10      2.14µs ± 4%    1.42µs ± 4%   -33.56%  (p=0.000 n=10+10)
EncodeStrings/100     12.7µs ± 3%    10.9µs ± 2%   -14.46%  (p=0.000 n=10+10)
EncodeStrings/1000     132µs ± 2%     114µs ± 2%   -13.88%  (p=0.000 n=10+9)

name                old alloc/op   new alloc/op   delta
EncodeStrings/10        657B ± 0%      704B ± 0%    +7.15%  (p=0.000 n=10+10)
EncodeStrings/100     6.14kB ± 0%    9.47kB ± 0%   +54.14%  (p=0.000 n=10+10)
EncodeStrings/1000    61.4kB ± 0%    90.1kB ± 0%   +46.66%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
EncodeStrings/10        3.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeStrings/100       3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
EncodeStrings/1000      3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
2018-11-01 18:59:19 +00:00
Edd Robinson 7032aed1c3 Batch oriented timestamp encoders
This commit adds a tsm1 function for encoding a batch of timestamps into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice. All slices are sorted.

name                       old time/op    new time/op    delta
EncodeTimestamps/10_seq       153ns ± 2%     104ns ± 2%  -31.62%  (p=0.000 n=9+10)
EncodeTimestamps/10_ran       191ns ± 2%     142ns ± 0%  -25.73%  (p=0.000 n=10+9)
EncodeTimestamps/10_dup       114ns ± 1%      68ns ± 4%  -39.77%  (p=0.000 n=8+10)
EncodeTimestamps/100_seq      704ns ± 2%     321ns ± 2%  -54.44%  (p=0.000 n=9+9)
EncodeTimestamps/100_ran     7.27µs ± 4%    7.01µs ± 2%   -3.59%  (p=0.000 n=10+10)
EncodeTimestamps/100_dup      756ns ± 3%     396ns ± 2%  -47.57%  (p=0.000 n=10+10)
EncodeTimestamps/1000_seq    6.32µs ± 1%    2.46µs ± 2%  -61.01%  (p=0.000 n=8+10)
EncodeTimestamps/1000_ran     108µs ± 0%      68µs ± 3%  -37.57%  (p=0.000 n=8+10)
EncodeTimestamps/1000_dup    7.26µs ± 1%    3.64µs ± 1%  -49.80%  (p=0.000 n=10+8)

name                       old alloc/op   new alloc/op   delta
EncodeTimestamps/10_seq       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_ran       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/10_dup       0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_seq      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_ran      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/100_dup      0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_ran     0.00B          0.00B          ~     (all equal)
EncodeTimestamps/1000_dup     0.00B          0.00B          ~     (all equal)

name                       old allocs/op  new allocs/op  delta
EncodeTimestamps/10_seq        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_ran        0.00           0.00          ~     (all equal)
EncodeTimestamps/10_dup        0.00           0.00          ~     (all equal)
EncodeTimestamps/100_seq       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_ran       0.00           0.00          ~     (all equal)
EncodeTimestamps/100_dup       0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_seq      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_ran      0.00           0.00          ~     (all equal)
EncodeTimestamps/1000_dup      0.00           0.00          ~     (all equal)
2018-11-01 18:59:19 +00:00
Edd Robinson b463f97b15 Batch oriented int encoders
This commit adds a tsm1 function for encoding a batch of ints into a
provided buffer.

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice, a randomly generated input slice and a
duplicate slice:

name                     old time/op    new time/op    delta
EncodeIntegers/10_seq       144ns ± 2%      41ns ± 1%   -71.46%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       304ns ± 7%     140ns ± 2%   -53.99%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       147ns ± 4%      41ns ± 2%   -72.14%  (p=0.000 n=10+9)
EncodeIntegers/100_seq      483ns ± 7%     208ns ± 1%   -56.98%  (p=0.000 n=10+9)
EncodeIntegers/100_ran     1.64µs ± 7%    1.01µs ± 1%   -38.42%  (p=0.000 n=9+9)
EncodeIntegers/100_dup      484ns ±14%     210ns ± 2%   -56.63%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq    3.11µs ± 2%    1.81µs ± 2%   -41.68%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    16.9µs ±10%    11.0µs ± 2%   -34.58%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup    3.05µs ± 3%    1.81µs ± 2%   -40.71%  (p=0.000 n=10+8)

name                     old alloc/op   new alloc/op   delta
EncodeIntegers/10_seq       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup       32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       128B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup      32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran    1.15kB ± 0%    0.00kB       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup     32.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
EncodeIntegers/10_seq        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_ran        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/10_dup        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_seq       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_ran       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/100_dup       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_seq      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_ran      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
EncodeIntegers/1000_dup      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
2018-11-01 18:59:19 +00:00
Edd Robinson 8190edbf14 Batch oriented float encoders
This commit adds a tsm1 function for encoding a batch of floats into a
buffer. Further, it replaces the `bitstream` library used in the
existing encoders (and all the current decoders) with inlined bit
expressions within the encoder, significantly reducing the function call
overhead for larger batches.
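
To illustrate the general idea of inlined bit expressions (as opposed to calling a bit-writer for every field), here is a minimal sketch. It is not the float encoder from this commit: the function `packFixedWidth` and its fixed-width input are assumptions chosen to keep the example short. The pending bits live in local variables and are flushed with plain shifts and masks.

```go
package main

import "fmt"

// packFixedWidth appends the low `width` bits of each value to dst,
// most significant bit first. width must be between 1 and 56 so the
// accumulator cannot lose pending bits. Hypothetical helper for
// illustration only.
func packFixedWidth(dst []byte, vals []uint64, width uint) []byte {
	var acc uint64 // pending bits, right-aligned
	var nacc uint  // number of pending bits held in acc
	mask := uint64(1)<<width - 1

	for _, v := range vals {
		acc = acc<<width | v&mask
		nacc += width
		for nacc >= 8 { // flush whole bytes as soon as they are available
			nacc -= 8
			dst = append(dst, byte(acc>>nacc))
		}
	}
	if nacc > 0 { // left-align the final partial byte, zero padded
		dst = append(dst, byte(acc<<(8-nacc)))
	}
	return dst
}

func main() {
	out := packFixedWidth(nil, []uint64{0xA, 0xB, 0xC}, 4)
	fmt.Printf("% x\n", out) // prints "ab c0"
}
```

Keeping the accumulator in locals lets the compiler hold it in registers across the loop, which is one reason this style tends to pay off on larger batches.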

The following benchmarks compare the performance of the existing
iterator based encoders, and the new batch oriented encoders. They look
at a sequential input slice and a randomly generated input slice.

name                   old time/op    new time/op    delta
EncodeFloats/10_seq      1.14µs ± 3%    0.24µs ± 3%  -78.94%  (p=0.000 n=10+10)
EncodeFloats/10_ran      1.69µs ± 2%    0.21µs ± 3%  -87.43%  (p=0.000 n=10+10)
EncodeFloats/100_seq     7.07µs ± 1%    1.72µs ± 1%  -75.62%  (p=0.000 n=7+9)
EncodeFloats/100_ran     15.8µs ± 4%     1.8µs ± 1%  -88.60%  (p=0.000 n=10+9)
EncodeFloats/1000_seq    50.2µs ± 3%    16.2µs ± 2%  -67.66%  (p=0.000 n=10+10)
EncodeFloats/1000_ran     174µs ± 2%      16µs ± 2%  -90.77%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
EncodeFloats/10_seq       0.00B          0.00B          ~     (all equal)
EncodeFloats/10_ran       0.00B          0.00B          ~     (all equal)
EncodeFloats/100_seq      0.00B          0.00B          ~     (all equal)
EncodeFloats/100_ran      0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_seq     0.00B          0.00B          ~     (all equal)
EncodeFloats/1000_ran     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
EncodeFloats/10_seq        0.00           0.00          ~     (all equal)
EncodeFloats/10_ran        0.00           0.00          ~     (all equal)
EncodeFloats/100_seq       0.00           0.00          ~     (all equal)
EncodeFloats/100_ran       0.00           0.00          ~     (all equal)
EncodeFloats/1000_seq      0.00           0.00          ~     (all equal)
EncodeFloats/1000_ran      0.00           0.00          ~     (all equal)
2018-11-01 18:59:19 +00:00
Edd Robinson 29114ec5f2 Rename time batch decoders 2018-11-01 18:59:19 +00:00
Edd Robinson 095ed44f48 Rename unsigned batch decoders 2018-11-01 18:59:19 +00:00
Edd Robinson d7a4b814d4 Rename string batch decoders 2018-11-01 18:59:19 +00:00
Edd Robinson db84dfae92 Rename boolean batch decoders 2018-11-01 18:59:19 +00:00
Edd Robinson bcb7b5d44a Rename integer batch decoders 2018-11-01 18:59:19 +00:00
Edd Robinson 2e00954703 Rename float batch decoders 2018-11-01 18:59:19 +00:00
Jeff Wendling 6830329ef4 review feedback 2018-10-31 15:41:39 -06:00
Jeff Wendling a7657ac409 tsdb: remove hll sketches
This keeps file compatibility by just writing out zeros for the
sizes and offsets. Perhaps it's ok to just nuke everything and
remove the data.

It also keeps the hll package because it seems generally useful
even if it's not currently being used.
2018-10-31 15:41:39 -06:00
Jeff Wendling 381d449b82 tsm1: remove digests and backup/restore 2018-10-31 15:41:07 -06:00
Chris Goller d8548d41e1 chore(fmt): update formatting with make fmt 2018-10-30 07:40:28 -05:00
Edd Robinson 46a7b8155a
Merge pull request #1170 from zhulongcheng/rm-index
refactor(tsdb): remove tsdb.Index and tsdb.IndexSet
2018-10-30 11:10:54 +00:00
Jonathan A. Sternberg 67dc4d8cdd
fix: conform to logging style guide for initial log messages
These are the log messages that get printed immediately when starting
the application for the first time. This fixes the messages to conform
to the logging style guide.
2018-10-29 16:42:55 -05:00
zhulongcheng 1dd0d33b1e fix type assertion err 2018-10-27 02:08:31 +08:00
zhulongcheng 268832ee64 remove unused seriesPointIterator 2018-10-27 02:08:31 +08:00
zhulongcheng f6104a7e78 remove unused Shard 2018-10-27 02:08:31 +08:00
zhulongcheng 9d29874e20 move SeriesFileDirectory constant to defaults package 2018-10-27 02:08:31 +08:00
zhulongcheng 5d66bbed48 remove functions for registering engine
This fix is to resolve an import cycle
2018-10-27 02:08:31 +08:00
zhulongcheng 0e9185f764 remove tsdb.Index interface
This fix is to resolve #886.
2018-10-27 02:08:31 +08:00
zhulongcheng c89c79dc02 replace tsdb.Index interface with tsi1.Index instance
This fix is to remove tsdb.Index interface to resolve #886.
2018-10-27 02:08:31 +08:00
zhulongcheng c1e732782e remove tsdb.IndexSet
This fix is to resolve #886.
2018-10-27 02:08:31 +08:00
zhulongcheng 427d719af8 remove tsdb.IndexSet tests
This fix is to remove tsdb.IndexSet to resolve #886.
2018-10-27 02:08:31 +08:00
zhulongcheng 28fecc1f6f replace tsdb.IndexSet with tsi1.Index
This fix is to remove tsdb.IndexSet to resolve #886.
2018-10-27 02:08:31 +08:00