Commit Graph

103 Commits (4337f5a54769b2f085114a6b76ec662d124b99e4)

Author SHA1 Message Date
Mark Rushakoff 426a9a0b8b Use math/bits exclusively instead of go-bits
We won't be rolling back to pre-Go1.9, so prefer the standard library
over a dependency that provides backwards compatibility.
2018-03-15 12:03:24 -07:00
Stuart Carnie d135aecf02 Generate trace logs for a number of significant influx operations
* tsdb Store.Open traces all events related to opening files
    * op.name : tsdb.open
* retention policy shard deletions
    * op.name : retention.delete_check
* all TSM compaction strategies
    * op.name : tsm1.compact_group
* series file compactions
    * op.name : series_partition.compaction
* continuous query execution (if logging enabled)
    * op.name : continuous_querier.execute
* TSI log file compaction
    * op_name: index.tsi.compact_log_file
* TSI level compaction
    * op.name: index.tsi.compact_to_level
2018-02-21 15:08:49 -07:00
Edd Robinson e5c8fd9dc5 Ensure nil sketches never returned 2018-02-09 15:29:42 +00:00
Edd Robinson 0d164f3164
WIP - tsi integration sketches 2018-02-07 14:52:13 -07:00
Edd Robinson 544329380f
Add empty series sketches back to tsi1 index
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.

The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.

This commit does not provide working sketches for post-compaction on the
tsi1 index.
2018-02-07 14:52:13 -07:00
Jason Wilder 0649150243 Reduce read IOPS in series file due to MADV_RANDOM
This reduces read IOPS to some extent.
2018-01-31 12:38:16 -07:00
Jason Wilder aed2784376 Reduce allocations when inserting into RHH
There were many small byte slices allocated and thrown away causing
a lot of garbage at higher cardinalities.
2018-01-31 12:38:11 -07:00
Edd Robinson a2fd8dd4ed Cleanup pkg package 2018-01-21 12:08:25 -08:00
Menno Finlay-Smits acaf5f097a pkg/escape: Add benchmarks for all bytes escape/unescape funcs 2018-01-16 11:12:47 +13:00
Menno Finlay-Smits b0e871876a pkg/escape: Preallocate output in Unescape
Preallocating the capacity of the output is faster and uses less
memory than letting append() do its own (over)allocation.
2018-01-16 09:37:37 +13:00
Ben Johnson d610a79487
Merge pull request #9295 from influxdata/partition-series-file
Partition series file
2018-01-11 08:45:18 -07:00
Adam 938db68198
Update restore functionality to run in online mode, consume Enterprise backup files. (#9207)
* Live Restore + Enterprise data format compatability

* Extended ImportData to import all DB's if no db name given

* Added a new enterprise data test, and backup command now prints the backup file paths at conclusion

* Added whole-system backup test

* Update to use protobuf in all enterprise data cases

* Update to test to do cross-testing with enterprise version

* incremental enterprise backup format support
2018-01-10 13:59:18 -05:00
Ben Johnson ac4dc91c64
Partition series file. 2018-01-10 08:33:25 -07:00
Ben Johnson c0a46d2d3d
Fix series file compaction stall.
The series file compaction previously did not snapshot the max
offset before compacting and would keep compacting until it reached
the end of segment file. This caused more entries than expected into
the RHH map and this map gets exponentially slower as it gets close
to full.
2018-01-08 09:53:01 -07:00
Stuart Carnie 44780742f7 fix format issue 2017-12-27 17:27:03 -07:00
Stuart Carnie c986cac76e improve performance when writes exceed max tag values or series
```
 benchmark                                                                  old ns/op     new ns/op     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        6175374       2714158       -56.05%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     344502        326312        -5.28%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              346734        329961        -4.84%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        2414945       1996223       -17.34%

 benchmark                                                                  old allocs     new allocs     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        45377          128            -99.72%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     33             20             -39.39%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              33             20             -39.39%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        15219          71             -99.53%

 benchmark                                                                  old bytes     new bytes     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        1354539       480114        -64.56%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     2101          1261          -39.98%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              2100          1261          -39.95%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        707247        477737        -32.45%
 ```
2017-12-27 17:27:03 -07:00
Stuart Carnie d7a8368d2d Add bytesutil.Contains; improve performance of SearchBytes
```
benchmark                           old ns/op     new ns/op     delta
BenchmarkContains_True-8            67.0          41.8          -37.61%
BenchmarkContains_False-8           73.2          47.6          -34.97%
BenchmarkSearchBytes_Exists-8       51.5          34.3          -33.40%
BenchmarkSearchBytes_NotExits-8     57.7          39.8          -31.02%
```
2017-12-27 17:27:03 -07:00
Ben Johnson 8b2dbf4d83
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-12-19 10:33:02 -07:00
Ben Johnson 107291c6b0
series file refactor 2017-12-19 10:31:33 -07:00
Edd Robinson 83032dfe54 windows test 2017-12-18 19:02:15 +00:00
Edd Robinson 2fd82645d8 Don't run test in race mode 2017-12-15 22:19:25 +00:00
Edd Robinson c476a0b4a1 Merge branch 'master' into er-tsi-index-part 2017-12-15 18:31:24 +00:00
Jason Wilder 749c9d2483 Rate limit disk IO when writing TSM files
This limits the disk IO for writing TSM files during compactions
and snapshots.  This helps reduce the spiky IO patterns on SSDs and
when compactions run very quickly.
2017-12-14 22:02:32 -07:00
Edd Robinson f6835632e7 Merge master into branch 2017-12-08 17:11:07 +00:00
Ben Johnson c36817fffc
Fix retain/release hang issues. 2017-12-06 09:09:41 -07:00
Jason Wilder c14b0e81b7 Save field types to speed up startup
This persists the field types in a shard to avoid having to scan
all the TSM files at startup.
2017-11-22 11:17:34 -07:00
Edd Robinson a5af19fc06 Address PR feedback 2017-11-17 12:43:48 +00:00
Ben Johnson ba4c9e0317
Merge remote-tracking branch 'upstream/master' into er-tsi-index-part 2017-11-14 16:14:13 -07:00
Jason Wilder 04f4c3e993 Optimize bytesutil.Pack 2017-11-13 09:02:10 -07:00
Jason Wilder 000768371f Optimized deletes in TSM index
This optimizes how deletes are processed to reduce memory usage
and improve efficiency.
2017-11-13 09:02:08 -07:00
Edd Robinson 0dd97cc84a
Add utility functions for merging k collections of sorted slices 2017-11-09 09:28:37 -07:00
Ben Johnson e05d4fdeeb
intermediate 2017-11-09 09:18:33 -07:00
Ben Johnson 48b48a8927
intermediate 2017-11-09 09:13:46 -07:00
Stuart Carnie 415ed14c53 storage service
* storage service is disabled by default
* default port 8082
* RPC interface generated using yarpc via service.proto
2017-10-25 13:38:07 -07:00
Stuart Carnie ac3bf300d3 fix overflow for 32-bit architecture 2017-10-20 10:22:28 -07:00
Stuart Carnie 47a2f8745e make (*T).Helper() optional 2017-10-20 08:59:50 -07:00
Stuart Carnie e9313876ab EXPLAIN ANALYZE
* Introduces EXPLAIN ANALYZE command, which
  produces a detailed tree of operations used to
  execute the query.

introduce context.Context to APIs

metrics package

* create groups of named measurements
* safe for concurrent access

tracing package

EXPLAIN ANALYZE implementation for OSS

Serialize EXPLAIN ANALYZE traces from remote nodes

use context.Background for tests

group with other stdlib packages

additional documentation and remove unused API

use influxdb/pkg/testing/assert

remove testify reference
2017-10-20 08:01:37 -07:00
Ben Johnson d17d0f18e0
Move copyBytes() and copyByteSlices() to bytesutil. 2017-10-18 07:19:46 -06:00
Jason Wilder ae821f4e2d Rework compaction scheduling
This changes the compaction scheduling to better utilize the available
cores that are free.  Previously, a level was planned in its own goroutine
and would kick off a number of compactions groups.  The problem with this
model was that if there were 4 groups, and 3 completed quickly, the planning
would be blocked for that level until the last group finished.  If the compactions
at the prior level are running more quickly, a large backlog could accumlate.

This now moves the planning to a single goroutine that plans each level in
succession and starts as many groups as it can.  When one group finishes,
the planning will start the next group for the level.
2017-10-03 10:48:13 -06:00
Edd Robinson a174f65595 use math/bits in HLL implementation 2017-09-26 12:51:08 +01:00
Edd Robinson 1028818ba6 Perf boost with some bit twiddling 2017-09-22 17:59:39 +01:00
Edd Robinson 5b7fc517fa Improve performance of TSI bloom filter
This commit replaces the previous hashing algorithm used by the pkg.Filter with
one based on xxhash. Further, taking from the hashing literature, we can
represent k hashes with only two hash function, where previously Filter was using
four.

Further, unlike `murmur3`, `xxhash` is allocation-free, so allocations have
dramatically reduced when inserting and checking for hashes.
2017-09-22 17:59:39 +01:00
Edd Robinson fe960b0f3a Add benchmarks for bloom filter 2017-09-22 17:59:32 +01:00
Jason Wilder db204f3eb7 Default concurrent compactions to 50% of available cores 2017-09-21 12:48:11 -06:00
Ben Johnson a40b2bb210 Simplify bloom filter invalidation. 2017-09-11 15:29:26 -06:00
Jason Wilder d3e832b462 Use offheap memory for indirect index offsets slice 2017-09-11 15:29:25 -06:00
Jason Wilder 4009223fb6 Avoid allocating murmur3.Hash too frequently
These hashes were getting allocate very frequently with high cardinality
datasets.  This allows them to be re-used.
2017-09-11 15:26:24 -06:00
Matt McCoy e43bec4a3a Test slices strings Exists* functions 2017-08-08 20:33:26 -04:00
Edd Robinson a43238618e Merge pull request #8512 from axiomhq/loglogbeta
Switch to LogLog-Beta Cardinality estimation
2017-07-07 16:14:16 +01:00
Seif Lotfy 4cb01c1768 change beta constants for the hll cardinality bias estimator 2017-06-30 07:47:16 -07:00