Commit Graph

1686 Commits (1975940f767a72746a3df0c6047751356e97032a)

Author SHA1 Message Date
Jason Wilder 888689f5d3 Move values loop under type switch
All the values read must be of the same type so repeatedly using
the type switch is confusing and less efficiient.
2017-04-20 13:39:49 -06:00
Jason Wilder b0988511bf Use fixed size array instead of slice 2017-04-20 13:38:33 -06:00
Jason Wilder da6bdfdda8 Use bufio.Reader when reading wal segments
Reduces disk IO due to small reads.
2017-04-20 13:33:42 -06:00
Jason Wilder 8e9cbd7ffc Simplify WALSegmentReader.UnmarshalBinary
There were two loops over nvals which created some extra allocation
which coudl be replaced with a simplet slice capacity and append.
2017-04-20 13:33:42 -06:00
Jason Wilder 02b663b651 Fix lock contention in Index.CreateSeriesListIfNotExists
There was contention on the write lock which only needs to be acquired
when checking to see if the log file should be rolled over.
2017-04-20 12:28:42 -06:00
Jason Wilder 40ec85aacd Fix lock contention in LogFile.SeriesWithBuffer
Under high write load, the check for each series was done sequentially
which caused a lot of CPU time to acquire/release the RLock on LogFile.

This switches the code to check multiple series at once under an RLock
similar to the chang for inmem.
2017-04-20 12:28:42 -06:00
Jason Wilder 0e715b5b74 Reduce lock contention on MeasurementFields 2017-04-20 12:28:42 -06:00
Jason Wilder ef65ee77f4 Switch WAL byte pools to sync/pool
The current bytes.Pool will hold onto byte slices indefinitely. Large
writes can cause the pool to hold onto very large buffers over time.
Testing w/ sync/pool seems to perform similarly now so using a sync/pool
will allow these buffers to be GC'd when necessary.
2017-04-20 12:28:42 -06:00
Jason Wilder d155d37ca8 Reduce TSM write buffer
When many TSM files are being compacted, the buffers can add up fairly
quickly.
2017-04-20 12:28:42 -06:00
Jason Wilder 3c2825a851 Reduce lock thrashing when checking series
The inmem index would call CreateSeriesIfNotExist for each series
which takes and releases and RLock to see if a series exists. Under
high write load, the lock shows up in profiles quite a bit.  This
adds a filtering step that obtains a single RLock and checks all the
series and returns the non-existent series to contine though the slow
path.
2017-04-20 12:28:41 -06:00
Jason Wilder d7c5dd0a3e Reduce wal sync goroutine churn
Under high write load, the sync goroutine would startup, and end
very frequently.  Starting a new goroutine so frequently adds a small
amount of latency which causes writes to take long and sometimes timeout.

This changes the goroutine to loop until there are no more waiters which
reduce the churn and latency.
2017-04-20 12:28:34 -06:00
Jason Wilder aa9925621b Fix deadlock in wal
If the sync waiters channel was full, it would block sending to the
channel while holding a the wal write lock.  The sync goroutine would
then be stuck acquiring the write lock and could not drain the channel.

This increases the buffer to 1024 which would require a very high write
load to fill as well as retuns and error if the channel is full to prevent
the blocking.
2017-04-19 11:33:13 -06:00
Jason Wilder a19ce9c10f Reduce index lock contention
Series and Measurment have their own locks and we do not need to
hold locks on the index while using those types.
2017-04-18 16:32:33 -06:00
Jason Wilder 883b3dcbbb Reduce lock content in AssignShard
The lock shows up under write load.  It only needs to be assigned
once so a read lock eliminates the contention.
2017-04-18 16:32:33 -06:00
Jason Wilder 5c51ae7319 Merge branch '1.2' into jw-merge-123 2017-04-14 14:36:54 -06:00
Jason Wilder ff1270dfeb Fix dropping fields created data corruption
The Point is intended to be immutable after being parsed since it
is shared by several goroutines.  When dropping a field (e.g. time),
corrupted data can result if one goroutine is delete the field
while another is marshaling the underlying byte slices.

To avoid this, the shard will just skip invalid fields and series
instead of trying to mutate them by deleting them.
2017-04-07 12:58:42 -06:00
Jason Wilder 1de99cd219 Merge pull request #8268 from influxdata/jw-dedup-measurements
Ensure MeasurementNames deduplicates measurements across shards
2017-04-06 13:10:56 -06:00
Jason Wilder 927acb5ab9 Ensure MeasurementNames deduplicates measurements across shards 2017-04-06 12:17:29 -06:00
Jason Wilder cf100647e0 Fix deadlock in Measurement.SeriesIDsAllOrByExpr
SeriesIDsAllOrByExpr took a RLock and ended up calling SeriesIDs
which can take a Lock causing a deadlock.
2017-04-05 16:22:45 -06:00
Ben Johnson 9c97cd8601
Merge remote-tracking branch 'upstream/master' into tsi 2017-04-04 12:46:09 -06:00
Ben Johnson 0d74497abe
Reset rhh map elements to reuse allocations. 2017-04-04 11:57:37 -06:00
Ben Johnson 6ff27c95e5
Fix tsi assertions. 2017-04-04 11:29:21 -06:00
Ben Johnson dbc10559c4 Merge pull request #8247 from benbjohnson/tsi-series-block-partitioning
TSI Series Block Partitioning
2017-04-04 11:15:10 -06:00
Jason Wilder 5fa8073fc2 Merge branch '1.2' into jw-merge-123 2017-04-04 11:12:06 -06:00
Jason Wilder 84cbee227a Fix file store not close all TSM files
Regression added via #8192
2017-04-04 10:58:51 -06:00
Ben Johnson 95d4016ff2
Merge branch 'tsi' of https://github.com/influxdata/influxdb into tsi-series-block-partitioning 2017-04-04 10:14:03 -06:00
Jason Wilder ec7eea2a0f Skip tests on appveyor that we skip w/ -race 2017-04-04 09:47:59 -06:00
Ben Johnson bf49b176f5
Partition tsi1 series index. 2017-04-04 09:46:04 -06:00
Jason Wilder 793635dbd7 Skip TSI cardinality tests on appveyor 2017-04-04 09:19:43 -06:00
Jason Wilder 4f850b5cff Skip TestCache_Deduplicate_Concurrent on windows 2017-04-04 08:48:55 -06:00
Jason Wilder 7ac3c9a26f Remove unused cardinality func 2017-04-03 11:24:55 -06:00
Jason Wilder fcdc3c5c21 Remove commented out code in meta 2017-04-03 11:22:59 -06:00
Jason Wilder 8da84e6144 Merge branch 'master' into tsi 2017-04-03 11:21:02 -06:00
Jason Wilder 68f73e64d1 Lazily sort Measurement.SeriesIDs
Removing series while trying to maintain the sorted series list
does not perform well when removing many series.  This causes
drop DB, RP, series, to be very slow in some cases.

Instead, lazily create a sorted series list when first requested and
invalidate it when dropping series.
2017-04-03 08:57:53 -06:00
Jason Wilder 32c4d43952 Speed up drop measurement
This reworks drop measurement to use a sorted list of series keys
instead of creating an intermediate map.  It remove allocations
and some extra garbage that is created during drop measurement.
2017-04-03 08:57:53 -06:00
Jason Wilder a78da51b7c Use buffered writer when writing tombstones
When deleting many series, the many small writes flood the disks
and consume a lot of CPU time.
2017-04-03 08:57:52 -06:00
Jason Wilder 6232d5e56d Remove defer allocations in TSMReader 2017-04-03 08:57:52 -06:00
Jason Wilder 920c8396c6 Use sorted merge in FileStore.WalkKeys
WalkKeys serially walked each TSM file and invoked fn for each key.
Caller needed to handle duplicate calls to fn with the same key
because the same key could exist in multiple TSM files.  The serial
execution was also slower.

Since the series keys are already sorted, we can iterate over all
files in parallel and skip duplicates using a sorted merge.  This
fixes the duplicate invocation issue as well as speeds up walking
all keys.

This can significant improve startup performance when many TSM files
exists that may not have been fully compacted.  This also has benefits
for deletes (measurements/series) since duplicates are removed saving
extra allocations and work.  This may also allow for the optimize
compaction to be removed provided startup times are fast enough.
2017-04-03 08:57:52 -06:00
Edd Robinson 5e342a2ddd Ensure shared index removed on database drop
When using the inmem index, if one drops a database, and then creates it
again, the previous index object will be reused. This includes the
previous cardinality estimation sketches, leading to inaccurate
cardinality estimations.
2017-03-30 13:05:31 +01:00
Edd Robinson ddf7f0fd7b Remove uncalled method 2017-03-30 12:48:22 +01:00
Edd Robinson fddaff2cc8 Merge master in 2017-03-29 18:00:28 +01:00
Edd Robinson 116230b427 Use varint for tag count 2017-03-29 16:31:13 +01:00
Edd Robinson 45f843fc91 Don't unassign shards when system shutting down 2017-03-29 11:57:38 +01:00
Ben Johnson 2edfb1c92d
Ignore series limit on database load. 2017-03-24 16:27:16 -06:00
Jason Wilder ee03fbb164 Fix series tombstone sketch not updated when dropping measurment 2017-03-24 15:49:00 -06:00
Ben Johnson d2b396bff5
Fix database series limit, remove shard series limit. 2017-03-24 13:16:00 -06:00
Ben Johnson 9fb8f1ec1d
Fix database and tag limits. 2017-03-24 09:48:10 -06:00
Jason Wilder 631681796d Remove tsl file committed by mistake 2017-03-23 16:18:27 -06:00
Jason Wilder 7119ef8f29 Merge pull request #8193 from influxdata/jw-123-backports
1.2.3 backports
2017-03-23 13:31:35 -06:00
Jason Wilder ca1919e5de Use standard merge algorithm for merging values
The previous version was very innefficient due to the benchmarks used
to optimize it having a bug.  This version always allocates a new
slice, but is O(n).
2017-03-23 12:53:59 -06:00
Jason Wilder ba2571903d Fix broken Values.Merge benchmark
Merge had the side effect of modifying the original values so
the results are wrong because they always hit the fast path
after the first run.
2017-03-23 12:53:50 -06:00
Jason Wilder 890ffb4ce8 Generate encode*Values funcs 2017-03-23 12:53:29 -06:00
Jason Wilder ced953ae89 Use typed values to avoid allocations
This switches compactions to use type values (FloatValues) from the
generic Values type.  It avoids a bunch of allocations where each value
much be converted from a specific type to an interface{}.
2017-03-23 12:53:17 -06:00
Jason Wilder a1c84ae6f3 Add block type for BlockIterator 2017-03-23 12:49:17 -06:00
Jason Wilder 2972a3f223 Remove MMAP derefencing code
This code was added to address some slow startup issues.  It is believed
to be the cause of some segfault panic's that occur at query time when
the underlying MMAP array has been unmapped.  The current structure of
code makes this change unnecessary now.
2017-03-23 12:46:23 -06:00
Jason Wilder 61f80db1b9 Skip cardinaltiy dups on circle race test 2017-03-22 15:20:38 -06:00
Jason Wilder c443e639b0 Fix 32bit alignment issue in wal.sync 2017-03-22 11:21:29 -06:00
Jason Wilder 06306bbd97 Run store tests in parallel 2017-03-22 10:55:38 -06:00
Edd Robinson 1c4ecb12c1 Don't panic on nil engine 2017-03-22 10:07:29 -06:00
Ben Johnson 85c1ae4dd4
Remove sort.Slice 2017-03-21 12:33:05 -06:00
Ben Johnson afe41f1c80
Fix tsm1/tsi1 broken tests. 2017-03-21 12:21:48 -06:00
Ben Johnson 1e9fa7bc2c
Fix 32-bit rhh implementation. 2017-03-21 11:44:13 -06:00
Jason Wilder 58c8736ebc Merge pull request #8172 from influxdata/jw-dropped-points
Fix series not getting created
2017-03-21 09:44:31 -06:00
Jason Wilder 92c722913d Unlock index if dropping non-existent series 2017-03-21 09:19:44 -06:00
Ben Johnson 5cf41adcb8
Optimize tsi1 write path.
Removes sorted series list in log, uses a buffer for HasSeries(),
and prepends a length for series key encoding.
2017-03-21 08:44:35 -06:00
Edd Robinson 47448646ed Remove DropSeries on index 2017-03-21 11:35:31 +00:00
Edd Robinson f89de550ed Significantly speed up DROP DATABASE 2017-03-21 11:35:31 +00:00
Jason Wilder 7f78c6ad8e Fix series not getting created 2017-03-20 17:19:22 -06:00
Jason Wilder 8f7b251afd Merge branch 'master' into jw-tsi 2017-03-20 17:17:26 -06:00
Jason Wilder 8177df2dab Simplify Measurement.TagSets signature 2017-03-17 16:19:10 -06:00
Jason Wilder 2d5d899ac2 Allow queries to be interrupted during planning
If a bad query is run, kill query and limits would not kick in until
after it started executing.  Some bad queries that involve high
cardinality can cause the server to OOM just from planning which
defeats the purpose of the max-select-series limit.

This change primarily fixes max-select-series limit so that the query
is killed earlier and has the side effect that kill query now can kill
a query while it's being planned.
2017-03-17 16:00:54 -06:00
Jason Wilder bc4aeefbed Check max-series-limit in shard iterator creation
The limit waited until all the iterators had been created which still
allows problem queries to be planned.  This allows the queries to be
aborted much earlier in some cases.
2017-03-17 16:00:25 -06:00
Ben Johnson 5d67c424bf
Refactor tsi1 write locking. 2017-03-17 11:20:50 -06:00
Ben Johnson 70efc70abe
Reduce lock contention, fix rhh lookup. 2017-03-17 09:44:11 -06:00
Jason Wilder 27ae2929fc Add wal-fsync-delay to Diagnostics 2017-03-15 16:31:03 -06:00
Jason Wilder e9eb925170 Coalesce multiple WAL fsyncs
Fsyncs to the WAL can cause higher IO with lots of small writes or
slower disks.  This reworks the previous wal fsyncing to remove the
extra goroutine and remove the hard-coded 100ms delay.  Writes to
the wal still maintain the invariant that they do not return to the
caller until the write is fsync'd.

This also adds a new config options wal-fsync-delay (default 0s)
which can be increased if a delay is desired.  This is somewhat useful
for system with slower disks, but the current default works well as
is.
2017-03-15 16:31:03 -06:00
Jason Wilder 7bd1bd8ab3 Only calculate disk size if shard has changed
Calling DiskSize can be expensive with many shards.  Since the stats
collection runs this every 10s by default, it can be expensive and
wasteful to calculate the stats when nothing has changed. This avoids
re-calculating the shard size unless something has chagned.
2017-03-15 16:29:57 -06:00
Ben Johnson 1807772388
Fix tsi tests. 2017-03-15 11:23:58 -06:00
Ben Johnson ee2e046853
Merge remote-tracking branch 'upstream/tsi-log-compact' into tsi 2017-03-15 10:22:32 -06:00
Ben Johnson cf7ba96377
Merge branch 'tsi-log-compact' into tsi 2017-03-15 10:18:40 -06:00
Ben Johnson 358b1e0b05
Merge remote-tracking branch 'upstream/master' into tsi 2017-03-15 10:13:32 -06:00
Jason Wilder 65464ea0d1 Merge pull request #8131 from influxdata/jw-values-merge
Use standard merge algorithm when merging Values
2017-03-15 09:51:21 -06:00
Jason Wilder a4cfeacedb Use standard merge algorithm for merging values
The previous version was very innefficient due to the benchmarks used
to optimize it having a bug.  This version always allocates a new
slice, but is O(n).
2017-03-15 08:59:41 -06:00
Edd Robinson 7d997d508a Fixes #8138 2017-03-15 12:50:22 +00:00
Edd Robinson ddcea1c322 WHY YOU SMITE ME BEN. B. JOHNSON? 2017-03-15 12:50:03 +00:00
Jason Wilder 4d37c9dc9e Fix broken Values.Merge benchmark
Merge had the side effect of modifying the original values so
the results are wrong because they always hit the fast path
after the first run.
2017-03-14 14:20:24 -06:00
Mark Rushakoff 535cf597f1 Report subset of config values in SHOW DIAGNOSTICS
This includes hand-selected config settings that are safe to expose and
not expected to include any kind of secrets.

Fixes #7821
2017-03-14 11:34:19 -07:00
Jason Wilder ca9c67a877 Generate encode*Values funcs 2017-03-14 11:54:53 -06:00
Ben Johnson d23f2971c3
Refactor TagBlockEncoder. 2017-03-10 10:08:16 -07:00
Jason Wilder 2f7d4995b4 Use typed values to avoid allocations
This switches compactions to use type values (FloatValues) from the
generic Values type.  It avoids a bunch of allocations where each value
much be converted from a specific type to an interface{}.
2017-03-09 16:27:07 -07:00
Jason Wilder 78b7815c49 Add block type for BlockIterator 2017-03-09 09:16:59 -07:00
Jason Wilder b9e5375043 Merge branch '1.2' into jw-merge-12 2017-03-08 13:16:50 -07:00
Jason Wilder 394bca3aad Validate field type when creating new fields 2017-03-06 16:13:17 -07:00
Jason Wilder 37187cbe6d Delete series under fields lock
Still seeing the panic that switching this logic around was supposed
to fix.  We now delete the bulk of data outside of the fields lock
and then again, under the write lock, to ensure that the field mapping
is accurate.  We don't do the full delete under the lock because it
can block writes and queries that require a read lock.
2017-03-06 14:19:55 -07:00
Jason Wilder 675d7c9d65 Merge branch '1.2' into jw-merge12 2017-03-06 11:09:05 -07:00
Jason Wilder eab012ef61 Fix points missing after compaction
If blocks containing overlapping ranges of time where partially
recombined, it was possible for the some points to get dropped
during compactions.  This occurred because the window of time of
the points we need to merge did not account for the partial blocks
created from a prior merge.

Fixes #8084
2017-03-06 10:17:11 -07:00
Jason Wilder 3c70abf061 Delete series before remove from field index
There is a race where the field type can be deleted while a new type
is written and during a query.  When this happens, an iterator for
the new type is created but old data make still exist in the cache
for TSM files causing a panic.
2017-03-06 09:38:27 -07:00
Jason Wilder 29f8d8de76 Fix race in WALEntry.Encode and Value.Deduplicate
Under high query load, a race exists in the cache and the WAL.  Since
writes currently hit the cache first, they are availble for query before
they hit the WAL.  If the WAL is writing and accessign the Value slice
at the same time that a query is run that needs to dedup the same slice,
a race occurs.

To fix this, the cache now just copies the values instead of storing the
slice passed in.  Another way to fix this might be to have the writes go
to the wal before the cache.  I think the latter would be better, but it
introduces some larger write path issues that we'd need to also address.
e.g. if the cache was full, writes to the WAL would need to be rejected
to avoid filling the disk.

Copying the slice in the cache is simpler for now and does not appear to
dramatically affect performance.
2017-03-06 09:38:22 -07:00
Ben Johnson 4c202eea09
Re-check field type under write lock. 2017-03-03 09:47:43 -07:00
Jonathan A. Sternberg 1081785cb4 Treat non-reserved measurement names with underscores as normal measurements
A measurement name that begins with an underscore and does not conflict
with one of the reserved measurement names will now be passed untouched
to the underlying shards rather than being intercepted as an empty
measurement.

A user still shouldn't rely on measurements that begin with underscores
to always be accessible, but this will prevent the most common use case
from causing unexpected behavior since we will very rarely, if ever, add
additional system sources.
2017-02-27 16:49:02 -06:00