Commit Graph

417 Commits (9b8c81c49c1d40ad610424398ce5d5d26003a6be)

Author SHA1 Message Date
Edd Robinson 7a55735562
Add option to set LogFile compaction size 2018-02-07 14:52:13 -07:00
Edd Robinson 544329380f
Add empty series sketches back to tsi1 index
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.

The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.

This commit does not provide working sketches for post-compaction on the
tsi1 index.
2018-02-07 14:52:13 -07:00
Jason Wilder 20d429c62b Use cached tags when applying series entries 2018-01-30 16:02:50 -07:00
Ben Johnson da8568d86c
Remove unused field. 2018-01-30 10:34:29 -07:00
Ben Johnson a6d11585b3
Add TSI compaction interruption. 2018-01-30 10:34:17 -07:00
Ben Johnson 0652effb78
Interrupt TSI & Series File Compactions 2018-01-30 10:34:17 -07:00
Edd Robinson b19edd55ac Ensure shard-level cardinality is correct 2018-01-29 16:22:42 +00:00
Edd Robinson 7931c78e2b Further simplifications 2018-01-23 06:57:51 -08:00
Edd Robinson 42c3adeffc simplify packages under tsdb 2018-01-21 09:41:27 -08:00
Edd Robinson 030fdc7966 Remove unused code/cleanup index packages 2018-01-20 13:56:28 +00:00
Jason Wilder d755daede8 Add ability to enable/disable tsi compactions 2018-01-18 14:25:58 -07:00
Jason Wilder a88ac031de Fix MeasurementHasSeries returning incorrect value
If all the series in a measurement were tombstone, MeasurementHasSeries
would return true because the ok var was re-used from a prior check
earlier in the func.  This caused it to be true all the time unless
the measurment was actually tombstoned.
2018-01-18 13:05:04 -07:00
Jason Wilder 28edf1392a Use full 32bits for series IDs
This reworks the series ID allocation to prevent an overflow issue.
2018-01-18 09:45:36 -07:00
Jason Wilder 2d4790b9f7
Merge pull request #9328 from influxdata/jw-delete-tsi-perf
Speed up deletes for tsi
2018-01-17 08:31:47 -07:00
Jason Wilder 5d6b8fc834 Drop measurement after series
This separates out the dropping of a measurement from the series
to avoid frequent checks to see if a measurement still has series.
The series are dropped individually and we keep track of which
measurements are involved and then delete each measurment afterwards.
2018-01-17 07:57:25 -07:00
Edd Robinson de0e9b1a4b Unify approach to short-circuit auth 2018-01-17 14:00:24 +00:00
Ben Johnson b36b9f109f
Merge pull request #9324 from influxdata/bj-tsi-log-entry-short-buffer
Fix LogEntry.UnmarshalBinary() short buffer check.
2018-01-16 18:42:09 -07:00
Ben Johnson 3937fed7a1
Add tsi1.Partition closing check before compaction. 2018-01-16 13:32:44 -07:00
Ben Johnson 5f9d53b586
Fix LogEntry.UnmarshalBinary() short buffer check. 2018-01-16 13:14:26 -07:00
Edd Robinson 338f284bc9 Simplify series set Merge logic 2018-01-16 14:56:54 +00:00
Edd Robinson d890f29fcb Remove redundant index methods
Now that each shard-local index is maintaining a bitset of series ids,
tracking the series present in the local shard's tsm engine, there is no
need to track shards in the `inmem` index.

This commit removes the methods associated with tracking those
series/shard relationships.
2018-01-16 14:56:54 +00:00
Ben Johnson e1aff89299
Fix data race. 2018-01-15 11:53:49 -07:00
Ben Johnson d0429dc582
Batch insert series for inmem index. 2018-01-15 11:13:59 -07:00
Ben Johnson cc30abcae6
Fix TSI MeasurementExists() test. 2018-01-15 08:28:53 -07:00
Edd Robinson a2ece0a49a Pass series id in via Index API 2018-01-15 12:00:31 +00:00
Ben Johnson 1c4ab05c7e Add fast TSI MeasurementHasSeries() check. 2018-01-15 12:00:30 +00:00
Ben Johnson b07e41fa7f Fix partition series set building. 2018-01-15 12:00:30 +00:00
Ben Johnson 9a15130a4c Persist TSI tombstones. 2018-01-15 12:00:30 +00:00
Ben Johnson 69757ccd15 Fix partition series set building. 2018-01-15 12:00:30 +00:00
Edd Robinson 3d153e3808 Don't creation series in partition if none assigned 2018-01-15 12:00:30 +00:00
Edd Robinson 4913f2b4ac Refactor test Index/Series file with correct open 2018-01-15 12:00:30 +00:00
Edd Robinson 96d55c4471 Fix reference bug 2018-01-15 12:00:30 +00:00
Edd Robinson bb6bfad5ea Ensure inmem index updated properly 2018-01-15 12:00:30 +00:00
Edd Robinson 7f244cb29f Use models series key for partition allocation
There are two series key formats: the `models` package format, which is
also line-protocol format, and the `tsdb` package format, which is used
by the series file when serialising series keys.

When writing to a series, rather than taking a `models` format key from
the `coordinator` package and then converting it to a `tsdb` package
format, it would be cheaper to keep the key in the `models` format
before hashing it to determine which partition the key lives in.
2018-01-15 12:00:30 +00:00
Edd Robinson a4bef3a4bc Refactoring delete tests 2018-01-15 12:00:30 +00:00
Jason Wilder ba9a5af7eb Mark series deleted in series file
This commit adds the ability to correctly mark a series as deleted in
the global series file. Whenever a shard engine determines that a series
should be deleted, it checks with each shard's bitset for series that
are to be deleted and are no longer contained in any shard-local
bitsets.

These series are then removed from the series file.
2018-01-15 12:00:30 +00:00
Ben Johnson d610a79487
Merge pull request #9295 from influxdata/partition-series-file
Partition series file
2018-01-11 08:45:18 -07:00
Edd Robinson e2262d3e8e Implement series id tracking in TSI index 2018-01-11 01:01:54 +00:00
Edd Robinson e610e7c21d Track undeleted series IDs per-shard with inmem
This commit adds a bitset into each shard's in-memory index, to be used to
track undeleted series ids. Currently tsi1 support is not implemented.

When new series are added to the shard, the series id is added
to the bitset. When series are deleted from the shard, the series
ids are removed from the bitset.

Becasue each shard shares the same inmem index reference, the bitset
is stored in the `ShardIndex`, which is local to each shard, and then
different references are passed into the shared `Index` object, depending
on which shard is writing the series.
2018-01-11 01:01:54 +00:00
Edd Robinson e6f3aa107a Move SeriesSet to tsdb.SeriesIDSet 2018-01-11 01:01:54 +00:00
Edd Robinson 35543e385f Tidy up 2018-01-11 01:01:54 +00:00
Ben Johnson 9bf45fcae0
Improve inmem insert performance with non-sequential series ids. 2018-01-10 13:08:16 -07:00
Ben Johnson ac4dc91c64
Partition series file. 2018-01-10 08:33:25 -07:00
Edd Robinson f73a710320 More insight into assertion 2018-01-04 16:23:50 +00:00
Ben Johnson 31c50532db
Add series existence check in tsi1. 2018-01-03 12:20:35 -07:00
Ben Johnson 98486a284a
Merge pull request #9265 from benbjohnson/series-file-compaction
Sequential series file id & series file segmentation
2018-01-03 10:05:59 -07:00
Ben Johnson 3900c948a2
Fix requested changes. 2018-01-03 10:04:12 -07:00
Ben Johnson 56980b0d24
Segment series file 2017-12-29 11:57:45 -07:00
Stuart Carnie ed207b54c3 updates after TSI / series file merge 2017-12-29 10:58:25 -07:00
Stuart Carnie 638caf3b58 ensure correctly aligned for 32-bit architecture 2017-12-29 07:58:52 -07:00
Stuart Carnie 455013a486 updates per PR review comments 2017-12-29 07:58:52 -07:00
Stuart Carnie 98aa368b7f prefer NameBytes 2017-12-29 07:58:52 -07:00
Stuart Carnie 5dfe3b2645 inmem startup improvments
* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted
  to the inmem package
* improve Measurement and Series types by exporting immutable
  fields and removing unnecessary APIs and locks

Reduced startup time from 28s to 17s. Overall improvement including
#9162 reduces startup from 46s to 17s for 1MM series across 14 shards.
2017-12-29 07:58:52 -07:00
Stuart Carnie ba17264ddd fixes after merge 2017-12-27 17:29:32 -07:00
Stuart Carnie c986cac76e improve performance when writes exceed max tag values or series
```
 benchmark                                                                  old ns/op     new ns/op     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        6175374       2714158       -56.05%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     344502        326312        -5.28%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              346734        329961        -4.84%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        2414945       1996223       -17.34%

 benchmark                                                                  old allocs     new allocs     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        45377          128            -99.72%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     33             20             -39.39%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              33             20             -39.39%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        15219          71             -99.53%

 benchmark                                                                  old bytes     new bytes     delta
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesExceeded-8        1354539       480114        -64.56%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxValuesNotExceeded-8     2101          1261          -39.98%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_NoMaxValues-8              2100          1261          -39.95%
 BenchmarkShardIndex_CreateSeriesListIfNotExists_MaxSeriesExceeded-8        707247        477737        -32.45%
 ```
2017-12-27 17:27:03 -07:00
Ben Johnson 4d7426ebbd
Fix race bug. 2017-12-21 10:12:21 -07:00
Ben Johnson 679335d027
Measurement iterator fix. 2017-12-20 15:43:17 -07:00
Ben Johnson 553c092484
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-12-20 15:22:24 -07:00
Ben Johnson d8b1d208c0
rebase 2017-12-20 15:13:34 -07:00
Edd Robinson 9767660b8f Use MeasurementIterator 2017-12-19 19:23:01 +00:00
Ben Johnson 8b2dbf4d83
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-12-19 10:33:02 -07:00
Ben Johnson 107291c6b0
series file refactor 2017-12-19 10:31:33 -07:00
Edd Robinson 72c0ec89fd Fix race on in-memory index 2017-12-18 16:22:19 +00:00
Edd Robinson bde66f19bc Adjust series file size and partitions 2017-12-18 13:17:42 +00:00
Edd Robinson 3bfe525705 Add 32-bit support to series file
This commit ensures that the series file should work appropriately on
32-bit architecturs. It does this by reducing the maximum size of a
series file to 512MB on 32-bit systems, which should be fully
addressable.

It further updates tests so that the series file size can be reduced
further when running many tests in parallel on 32-bit architectures.
2017-12-15 15:47:26 +00:00
Edd Robinson 7e662a1294 Fix some races 2017-12-15 01:18:36 +00:00
Edd Robinson 289d1f8d44 Allow iterators to return if shard is closing 2017-12-15 00:46:43 +00:00
Edd Robinson 9e3b17fd09 Ensure deleted series are not returned via iterators 2017-12-14 21:29:35 +00:00
Edd Robinson 7080ffcaaa Fix MANIFEST test 2017-12-13 15:55:49 +00:00
Edd Robinson f1bcc97e89 Fix auth tests 2017-12-12 21:25:35 +00:00
Edd Robinson 077cbba0e8 Fix index tests 2017-12-12 21:25:35 +00:00
Ben Johnson 288c5217e8
Fix tsi1 tools. 2017-12-08 16:12:33 -07:00
Edd Robinson f6835632e7 Merge master into branch 2017-12-08 17:11:07 +00:00
Edd Robinson 3318c94a2f Clean up 🛁: 2017-12-08 11:38:53 +00:00
Ben Johnson 0e0e7cfc08
Fix tests. 2017-12-07 09:59:39 -07:00
Ben Johnson 37803d6803
Fixed 'tests' pkg. 2017-12-07 08:33:47 -07:00
Ben Johnson c36817fffc
Fix retain/release hang issues. 2017-12-06 09:09:41 -07:00
Ben Johnson f9807a635c
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-12-05 12:10:17 -07:00
Ben Johnson 493c1ed0d1
inmem tests passing. 2017-12-05 10:49:58 -07:00
Edd Robinson ca1cfe4b81 Refactor File mock 2017-12-05 16:17:15 +00:00
Ben Johnson f5f85d65f9
Fixing more tests. 2017-12-04 10:29:04 -07:00
Ben Johnson e0df47d54f
Fixing up tests. 2017-12-02 16:52:34 -07:00
Edd Robinson 1e891b5fbc Change logging level 2017-11-30 14:08:44 +00:00
Jason Wilder 5cf7d52694 Ensure series keys are sorted
The Measurement added series keys from a map where the iteration
order is non-deterministic.  The keys should be returned in sorted
order.
2017-11-29 11:24:10 -07:00
Ben Johnson ca09f18e65
intermediate: tsdb compile 2017-11-29 11:20:18 -07:00
Edd Robinson 81976bca59 Refactor based on new design 2017-11-28 17:54:29 +00:00
Edd Robinson 38e0dd695f Allow concurrent access to Engine Index 2017-11-28 15:57:03 +00:00
Edd Robinson abae36f992 Ensure all index fields set 2017-11-28 15:57:02 +00:00
Edd Robinson 368420c670 Fix test due to index changes 2017-11-28 15:57:02 +00:00
Edd Robinson 67c67aeb34 Update test for Windows 2017-11-28 15:57:02 +00:00
Edd Robinson 12a2ff7fac Add support for TSI shard streaming and shard size
This commit firstly ensures that a shard's size on disk is accurately
reported when using the tsi1 index, by including the on-disk size of the
tsi1 index in the calculation.

Secondly, this commit add support for shard streaming/copying when using
the tsi1 index. Prior to this, a tsi1 index would not be correctly
restored when streaming shards.
2017-11-28 15:57:02 +00:00
Ben Johnson cc22134d8f
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-11-27 07:52:39 -07:00
Ben Johnson 01491ca4f4
intermediate 2017-11-27 07:52:18 -07:00
Edd Robinson 4831545830 Add PR typo/doc changes 2017-11-27 14:05:30 +00:00
Ben Johnson fc966a1b67
Add series file backup/restore. 2017-11-22 08:55:54 -07:00
Edd Robinson 68dd5e27c8 Improve performance of TagKeys 2017-11-21 17:16:47 +00:00
Edd Robinson 5ff96d9193
Merge pull request #9127 from influxdata/er-fga
Implement FGA on SHOW COMMANDS
2017-11-20 14:51:58 +00:00
Jason Wilder 0551849298 Reduce calls to time.Now()
These were showing up in profiles during heavy write load.
2017-11-17 14:23:02 -07:00
Edd Robinson a5af19fc06 Address PR feedback 2017-11-17 12:43:48 +00:00
Edd Robinson bff69f7a82 Refactor inmem implementation 2017-11-17 11:06:43 +00:00
Edd Robinson 25f0fedd6f Fix MeasurementNamesByExpr in tsi1 2017-11-17 11:06:43 +00:00
Edd Robinson 3967e78885 Consolidate tests to tsdb package 2017-11-17 11:06:43 +00:00
Edd Robinson b3407c5d46 Correct authorisation on inmem SHOW MEASUREMENTS 2017-11-17 11:06:43 +00:00
Edd Robinson d4cecd7cc7 Add index authorisation test coverage 2017-11-17 11:06:43 +00:00
Edd Robinson 6851db3fc9 Add FGA support to SHOW MEASUREMENTS 2017-11-17 11:06:43 +00:00
Edd Robinson aa17ef55f9 Implement FGA on SHOW SERIES 2017-11-17 11:06:43 +00:00
Edd Robinson 8acab9b5ac Fix existing bug where database was empty 2017-11-17 11:06:43 +00:00
Ben Johnson ede3fcf98e
intermediate 2017-11-15 16:09:25 -07:00
Ben Johnson ba4c9e0317
Merge remote-tracking branch 'upstream/master' into er-tsi-index-part 2017-11-14 16:14:13 -07:00
Jason Wilder 8b18cc4456 Optimize deletes in tsi
The DropSeries code path ended up creating a MeasurementSeriesIterator
for each dropped series, this was too expensive just to see if a
series exists.

This adds a HasSeries func and fixes and issue where TSI files were
compacted while an iterator was still in use causing a panic.
2017-11-13 12:35:38 -07:00
Jason Wilder 13692639cb Fix create/delete series race
This fixes a race where writes and deletes to the same series and
measurements could sometimes leave the index in an inconsistent state.
2017-11-13 09:02:10 -07:00
Jason Wilder 80cd5e63af Optimize DeleteSeriesRange
This removes more allocations and speeds up some critical sections.
2017-11-13 09:02:10 -07:00
Jason Wilder f893beb6d8 Use MeasurementSeriesKeysByExprIterator for deletes 2017-11-13 09:02:10 -07:00
Jason Wilder 16d1f4309b Extract MeasurementSeriesKeysByExprIterator 2017-11-13 09:02:10 -07:00
Ben Johnson 9756a29678
import fix 2017-11-13 08:54:32 -07:00
Jonathan A. Sternberg 0b7c56bcd8 Update the zap logger dependency
The previous sha was taken from a revision on a devel branch that I
thought would continue staying in the tree after it was merged. That
revision was rebased away and the API was changed for the logger.

This updates the usage of the logger and adds a simple package for
constructing the base logger.

The 1.0 version of zap changed the format of the default console logger
so this change moves over to this new logger instead of attempting to
retain backwards compatibility with the old format.
2017-11-10 16:27:16 -06:00
Ben Johnson e278af2b18
intermediate 2017-11-09 09:30:19 -07:00
Ben Johnson d3cd750509
Refactor series file tombstoning. 2017-11-09 09:30:19 -07:00
Ben Johnson 3034d3fb54
intermediate 2017-11-09 09:30:19 -07:00
Ben Johnson 919f99f34d
Fixing tests. 2017-11-09 09:30:19 -07:00
Ben Johnson 07a743cca7
Rebase fixes 2017-11-09 09:29:19 -07:00
Edd Robinson 4471341d7e
Ensure error channel has capacity for all partitions 2017-11-09 09:28:37 -07:00
Ben Johnson b24b08a23c
Fix partition loading. 2017-11-09 09:28:37 -07:00
Ben Johnson 1f6d4ed1d1
Add series map. 2017-11-09 09:28:37 -07:00
Edd Robinson 49218fd3bd
Fix issue with series being added to log file 2017-11-09 09:28:37 -07:00
Edd Robinson 87778f3c45
Open partitions in parallel 2017-11-09 09:28:37 -07:00
Edd Robinson 3ae799b3a5
WIP Fix build 2017-11-09 09:28:37 -07:00
Edd Robinson ebb23df1cf
Implement most merge based methods 2017-11-09 09:28:37 -07:00
Edd Robinson 6d87ff7fa2
WIP - series point iterator 2017-11-09 09:28:37 -07:00
Edd Robinson aec607bddf
Implement Measurement sketches 2017-11-09 09:28:37 -07:00
Edd Robinson b39aa858cf
Implement series creation 2017-11-09 09:28:37 -07:00
Edd Robinson 65c6fa747e
Implement methods that don't require merge 2017-11-09 09:28:37 -07:00
Edd Robinson bf132004a3
Implement basic partition layout 2017-11-09 09:28:37 -07:00
Edd Robinson 7aa9de508d
Initial refactor of tsi1.Index
This commit carries out the initial refactor of the tsi1.Index into
tsi1.Partition. We then create a new tsi1.Index that will be an
abstraction over a collection of Partitions.
2017-11-09 09:27:56 -07:00
Edd Robinson fb646549f4
Index files -> partition files 2017-11-09 09:26:06 -07:00
Ben Johnson 328bffd658
Convert series ids to 64-bits. 2017-11-09 09:26:06 -07:00
Ben Johnson 0ffd94a37a
Fix rebase 2017-11-09 09:25:10 -07:00
Ben Johnson 08e459357a
Fix tsi race conditions. 2017-11-09 09:18:33 -07:00
Ben Johnson c75f1127aa
intermediate 2017-11-09 09:18:33 -07:00
Ben Johnson f223153078
Initial working version of series file. 2017-11-09 09:18:33 -07:00
Ben Johnson e05d4fdeeb
intermediate 2017-11-09 09:18:33 -07:00
Ben Johnson 9ad2b53881
intermediate 2017-11-09 09:18:33 -07:00
Ben Johnson 7259589241
intermediate 2017-11-09 09:18:33 -07:00
Ben Johnson 48b48a8927
intermediate 2017-11-09 09:13:46 -07:00
Ben Johnson 156f25ac23
Improve SHOW TAG KEYS performance. 2017-11-07 10:59:19 -07:00
Edd Robinson 07c4fdc1ed Fix data race on SeriesPointIterator 2017-11-07 10:48:23 +00:00
Stuart Carnie f3d45ba301 influxdata/influxdb/influxql -> influxdata/influxql 2017-10-30 14:40:26 -07:00
Ben Johnson 49c1fca036
Handle nil MeasurementIterator. 2017-10-26 11:25:46 -06:00
Edd Robinson 47bd069315 Fix race in Measurement index
Fixes #8989 and #8633.

Previously when issuing commands involving a regex check, walking
through the tags keys/values on a measurement, using the measurement's
index, would be racy.

This commit adds a new `TagKeyValue` type that abstracts away the
multi-layer map we were using as an inverted index from tag keys and
values to series ids. With this abstraction we can also make concurrent
access to this inverted index goroutine safe.

Finally, this commit fixes a very old bug in the index which will affect
any query using a regex. Previously we would always check _every_ tag
against a regex for a measurement, even when we had found a match.
2017-10-25 13:34:21 +01:00
Ben Johnson 5a77238f30
Sort & validate TSI key value insertion. 2017-10-23 10:46:01 -06:00