Commit Graph

602 Commits (d9a60d7e4c32ce75ca9ef214c48f1ffd1296ad39)

Author SHA1 Message Date
Jason Wilder ae925625ce Merge pull request #4451 from influxdb/jw-int64
Int64 encoding enhancements
2015-10-15 13:44:55 -06:00
Jason Wilder 3db28cfe99 Fix typos 2015-10-15 13:36:42 -06:00
Nathaniel Cook d338f38d65 update reduce func signatures to not have influxql.Call arguments 2015-10-15 09:24:16 -06:00
Daniel Morsing 13f6fd6238 Add measurement reference back to CQs
CQs has a feature where you can tell it to use the name of the originating measurement as the target name. This was missed when I reimplemented it.
2015-10-15 13:10:38 +00:00
Daniel Morsing d990b5f28d fix into queries when encountering nil values.
For aggregate queries, having a null result means that you haven't
got any data for that time period. CQs used this as a signal that
the measurement was not created and dropped the entire write.

INTO queries can have any structure, including wildcards, so dropping
the entire query isn't going to work. Instead, just drop the nulls
returned.
2015-10-15 12:12:40 +00:00
Jason Wilder c037ae1416 Add run-length encoding to int64 encoder
This encoding can really help counter values with large runs of the same
value (e.g. a counter that is 0 for long periods of time).
2015-10-14 19:23:58 -06:00
Jason Wilder 014c51f32a Delta-encode int64 before bit packing
This will help large integer counters type fields that increment by
small amounts over time.  Instead of storing the larger raw value
in a compressed format, we store the difference from the prior value
in compressed format which allows the value to be stored using
fewer bits.
2015-10-14 19:23:51 -06:00
Sean Beckett 82f104a8b1 Merge pull request #4436 from influxdb/tag-names-to-keys
WIP tag name --> tag key, field name --> field key
2015-10-14 16:02:46 -07:00
Ben Johnson f2d23b070b add tsm1 wal quickcheck
This commit adds quickcheck testing for the tsm1 WAL.
2015-10-14 09:38:38 -06:00
Daniel Morsing 6d188d9703 Merge pull request #4409 from influxdb/intoq
wire up INTO queries.
2015-10-14 15:29:54 +01:00
Philip O'Toole 6d38559f24 RLock access to store's shards
Fix issue #4442.
2015-10-13 20:28:19 -07:00
Sean Beckett 1d83c8c427 Update meta.go 2015-10-13 16:46:59 -07:00
Daniel Morsing 62dff895e2 wire up INTO queries.
Since INTO queries need to have absolute information about the database
to work, we need to create a loopback interface back to the cluster
in order to perform them.
2015-10-13 15:00:36 +00:00
Cory LaNou 6787525912 fixes multiple selectors overwriting each other. fixes #4360 2015-10-12 21:40:57 -05:00
Jason Wilder 16b3084ca9 Merge pull request #4397 from influxdb/jw-tsmdump
tsm1 file dump
2015-10-12 08:12:23 -06:00
Paul Dix a99c9ec5af Ensure rewrite index doesn't write corrupt file.
Fixes #4401
2015-10-10 16:58:25 -04:00
Jason Wilder 629219951a Fix timestamp encoding not using run-length encoding when possible
influx_inpsect uncovered some scenarios where timestamps could be stored using
run-length encoding but were being stored using simple8 which uses more space.
2015-10-09 22:38:17 -06:00
David Norton 9627f5ab84 fix #4280: add comments based on code review 2015-10-09 18:38:33 -04:00
David Norton 512d6ac050 fix #4280: only drop points matching WHERE clause 2015-10-09 18:34:32 -04:00
Jason Wilder 758359accc Prevent panic in DecodeSameTypeBlock
If DecodeSameTypeBlock is called on on an empty Values slice, it would
panic with an index out of bounds error.  This func can actually be removed
because DecodeBlock can determine what type of values are encoded already.

This will still panic if the block cannot be decoded due to other reasons.

Fixes #4365
2015-10-09 12:52:23 -06:00
Ben Johnson 2b3bb5336d add tsm1 quickcheck tests 2015-10-08 11:59:57 -06:00
Jason Wilder 79185fc1dc Fix index out of bounds panic in int64Decoder
Code was missing a check for when we did not have anymore bytes to decode
so it panic when we tried to decode the empty slice.
2015-10-08 11:21:19 -06:00
Jason Wilder b3343a6d2a Fix similar float values encoding overflow
If similar float values were encoded, the number of leading bits would
overflow the 5 available bits to store them (e.g. store 33 in 5 bits).  When
decoding, the values after the overflowed value would spike to very large and
small values.

To prevent the overflow, we clamp the value to 31 which is the maximum
number of leading zero bits we can encoded.

Fixes #4357
2015-10-07 15:05:56 -06:00
Nathaniel Cook 6f1a44bd07 Merge pull request #4345 from influxdb/remove_iterator
tsdb.Iterator is no longer used. Removing
2015-10-06 17:00:50 -06:00
Paul Dix f041939a1c Merge pull request #4308 from influxdb/pd-storage-engine
The TSM storage engine
2015-10-06 15:54:56 -07:00
Paul Dix b11308133a Only limit field count for non-tsm engines 2015-10-06 15:49:37 -07:00
Paul Dix 40ff4f4a86 Change default to bz1 2015-10-06 15:30:34 -07:00
Jason Wilder 41e3294d4a Fix panic: assignment to entry in nil map
Closing the store did not properly return an error for in-flight
writes because the closing channel was set to nil when closed.  A
nil channel is not selectable so writes continue on past the guard
checks and trigger panics.
2015-10-06 14:03:52 -06:00
Paul Dix be477b2aab Fix cursor bug on index 2015-10-06 12:26:45 -07:00
Nathaniel Cook d380ee37a2 tsdb.Iterator is no longer used. Removing 2015-10-06 10:34:07 -06:00
Paul Dix 267f34b94e Updates based on PR feedback 2015-10-05 20:09:56 -04:00
Paul Dix 26a93ec23e Fix deletes not kept if shutdown before flush on tsm1 2015-10-05 20:09:56 -04:00
Paul Dix bb398daf75 Updates based on @otoolp's PR comments 2015-10-05 20:09:56 -04:00
Jason Wilder c6f2f9cec2 Avoid duplicating values slice when encoding 2015-10-05 20:09:56 -04:00
Jason Wilder cb28dabf62 Make DecodeBlock panic if block size is too small
Should never get a block size 9 bytes since Encode always returns the min
timestampe and a 1 byte header.  If we get this, the engine is confused.
2015-10-05 20:09:56 -04:00
Jason Wilder b0449702e5 Fix comment typos 2015-10-05 20:09:56 -04:00
Paul Dix d9f94bdeeb Add db crash recovery 2015-10-05 20:09:56 -04:00
Jason Wilder 1d754db00b Propogate all encoding errors to engine
Avoid panicing in lower level code and allow the engine to decide what
it should do.
2015-10-05 20:09:56 -04:00
Jason Wilder 4c54c78009 Move compression encoding constants to encoders
Will make it less error-prone to add new encodings int the future
since each encoder has it's set of constants.  There are some placeholder
contants for uncompressed encodings which are not in all encoder currently.
2015-10-05 20:09:56 -04:00
Jason Wilder b1a57e1628 Fix go vet errors 2015-10-05 20:09:56 -04:00
Jason Wilder ab791ba913 Fix TestStoreOpenShardCreateDelete
Shard path can be a directory.
2015-10-05 20:09:56 -04:00
Paul Dix d47ddb5454 Cleanup after pd1 -> tsm1 name change. 2015-10-05 20:09:55 -04:00
Paul Dix 594253cbba Rename storage engine to tsm1, for Time Structured Merge Tree! 2015-10-05 20:09:55 -04:00
Paul Dix 0a11a2fdbc Add deletes to new storage engine 2015-10-05 20:09:55 -04:00
Paul Dix 4beca1a245 Implement reverse cursor direction on pd1 2015-10-05 20:09:55 -04:00
Jason Wilder dbf6228817 Fix go vet 2015-10-05 20:09:55 -04:00
Jason Wilder d9499f0598 Remove zig zag encoding from timestamp encoder
Not needed since all timestamps will be sorted in ascending order.  Negatives
are not possible.
2015-10-05 20:09:55 -04:00
Paul Dix a2b139e006 Fix compaction and multi-write bugs.
* Fix bug with locking when the interval completely covers or is totally inside another one.
* Fix bug with full compactions running when the index is actively being written to.
2015-10-05 20:09:55 -04:00
Jason Wilder 2366baaf0b Handle partial reads when loading WAL
If reading into fixed sized buffer using io.ReadFull, the func can
return io.ErrUnexpectedEOF if the read was short.  This was slipping
through the error handling causing the shard to fail to load.
2015-10-05 20:09:55 -04:00
Paul Dix 3332236527 Fix bugs with writing old data and compaction. 2015-10-05 20:09:55 -04:00
Jason Wilder 5d938d0a8b Add test with duplicate timestamps
Should not happen but makes sure that the same values are encoded
and decoded correctly.
2015-10-05 20:09:55 -04:00
Jason Wilder c47d14540d Add compressed string encoding
Uses snappy to compress multiple strings into a block
2015-10-05 20:09:55 -04:00
Paul Dix 861a15b3e6 Fix panic when data file has small index 2015-10-05 20:09:55 -04:00
Paul Dix be011b8da9 Add logging to pd1 2015-10-05 20:09:54 -04:00
Paul Dix c1213ba367 Update WAL to deduplicate values on Cursor query.
Added test and have failing section for single value encoding.
2015-10-05 20:09:54 -04:00
Jason Wilder 9f9692acdf Rename float encoding tests 2015-10-05 20:09:54 -04:00
Jason Wilder a4d92162ef Add documentation about compression 2015-10-05 20:09:54 -04:00
Jason Wilder 2da52ec4fe Fix deadlock in pd1_test.go
The defer tx.Rollback() tries to free the queryLock but the defer e.Cleanup() runs
before it and tries to take a write lock on the query lock (which blocks) and prevents
tx.Rollback() from acquring the read lock.
2015-10-05 20:09:54 -04:00
Jason Wilder 7e0df18e1a Update simple8b api usage 2015-10-05 20:09:54 -04:00
Jason Wilder cb23f5ac53 Add a compressed boolean encoding
Packs booleans into bytes using 1 bit per value.
2015-10-05 20:09:54 -04:00
Jason Wilder 1196587dc4 Keep track of the type of the block encoded
Allowes decode to decode an arbitrary block correctly.
2015-10-05 20:09:54 -04:00
Jason Wilder 731ae27123 Remove unnecessary allocations from int64 decoder
The decoder was creating a large slice and decoding all values when
instead, it could decode one packed value as needed.
2015-10-05 20:09:54 -04:00
Jason Wilder 95046c1e37 Add test assertions for time encoding type 2015-10-05 20:09:54 -04:00
Jason Wilder e42d8660d0 Fix run length encoding check
Values were run length encoded even when they should not have been
2015-10-05 20:09:54 -04:00
Jason Wilder 092689c131 Reduce memory allocations
Converting between different encoders is wasting a lot of memory allocating different
typed slices.
2015-10-05 20:09:54 -04:00
Jason Wilder ce1d45ecda Use zigzag encoding for timestamp deltas
Previously were using a frame of reference approach where we would
transform the (possibly negative) deltas into positive values from
the minimum.  That required an extra pass over the values as well
as a large slice allocation so we could encode the originals in uncompressed
form if they were too large.

This switches the encoding to use zigzag encoding for the deltas which
removes the extra slice allocation as well as the extra loops.

Improves encoding performane by ~4x.
2015-10-05 20:09:54 -04:00
Jason Wilder 4a37ba868d Add int64 compression
This is using zig zag encoding to convert int64 to uint64s and then using simple8b
to compress them, falling back to uncompressed if the value exceeds 1 << 60.  A
patched encoding scheme would likely be better in general but this provides decent
compression for integers that are not at the ends of the int64 range.
2015-10-05 20:09:53 -04:00
Jason Wilder 42e1babe7f Add time and float compression
Time compression uses an adaptive approach using delta-encoding,
frame-of-reference, run length encoding as well as compressed integer
encoding.

Float compression uses an implementation of the Gorilla paper encoding
for timestamps based on XOR deltas and leading and trailing null suppression.
2015-10-05 20:09:53 -04:00
Jason Wilder 112a03f24c Fix go vet errors 2015-10-05 20:08:58 -04:00
Jason Wilder 88248f3f81 Ensure we have files when iterating in cursor
Prevents index out of bounds panic
2015-10-05 20:08:58 -04:00
Paul Dix db4ad33f3c Update tests to use transactions. Add test for single series 10k points. 2015-10-05 20:06:22 -04:00
Paul Dix 0b33a71bb7 Add recover to maintenance. Change snapshot writer to not use bolt on shard. 2015-10-05 20:06:22 -04:00
Paul Dix 0fd116d1f2 Ensure data files can't be deleted while query is running.
Also ensure that queries don't try to use files that have been deleted.
2015-10-05 20:06:22 -04:00
Paul Dix b1bdb4f15a Make compaction run at most at set duration. 2015-10-05 20:06:22 -04:00
Paul Dix 1c8eac1523 Add PerformMaintenance to store for flushes and compactions.
Also fixed shard to work again with b1 and bz1 engines.
2015-10-05 20:06:22 -04:00
Paul Dix d694454f47 Fix wal flushing, compacting, and write lock 2015-10-05 20:06:22 -04:00
Paul Dix 6c94e738a0 Add support for multiple fields 2015-10-05 20:06:22 -04:00
Paul Dix 667b3e6c08 Handle hash collisions on keys 2015-10-05 20:06:22 -04:00
Paul Dix 48069e782c Add compaction and time range based write locks. 2015-10-05 20:06:22 -04:00
Paul Dix 2eb2a647d6 Add multicursor to combine wal and index 2015-10-05 20:06:22 -04:00
Paul Dix 7baba84a21 Ensure we don't have duplicate values. Fix panic in compaction. 2015-10-05 20:06:22 -04:00
Paul Dix 0770ccc87d Make writes to historical areas possible 2015-10-05 20:06:21 -04:00
Paul Dix 982c28b947 Update to work with new cursor definitiono and Point in models 2015-10-05 20:06:21 -04:00
Paul Dix 365a631b53 Update wal to only open new segment file on flush if its not an idle flush 2015-10-05 20:06:21 -04:00
Paul Dix 7c8ab4f1d8 Add test for close and restart of engine and fix errors. 2015-10-05 20:06:21 -04:00
Paul Dix c5f6c57d7f Update engine to put index at the end of data files 2015-10-05 20:06:21 -04:00
Paul Dix fe1f9a51e5 Add memory settings and WAL backpressure 2015-10-05 20:06:21 -04:00
Paul Dix 5e59cb9393 Update encoding test to work with new interface. 2015-10-05 20:06:21 -04:00
Paul Dix 2100e66437 Add full durability to WAL and flush on startup 2015-10-05 20:06:21 -04:00
Paul Dix 82e1be7527 WIP: more WAL work 2015-10-05 20:06:21 -04:00
Paul Dix 2ba032b7a8 WIP: finish basics of PD1. IT WORKS! (kind of) 2015-10-05 20:06:21 -04:00
Paul Dix 7555ccbd70 WIP: engine work 2015-10-05 20:06:21 -04:00
Paul Dix 12ea1cb26f Add comment about encoding float 2015-10-05 20:06:21 -04:00
Paul Dix fb2a1cb2f3 WIP: skeleton for encoding for new engine 2015-10-05 20:06:20 -04:00
David Norton 4375545064 fix #4276: walk DropSeriesStatement 2015-10-05 19:56:30 -04:00
Philip O'Toole 2ac0357406 Support dropping non-Raft nodes 2015-10-04 00:19:52 -07:00
Philip O'Toole d74e0690c7 Revert "Merge pull request #4233 from influxdb/drop-server"
This reverts commit 0bdb36f6dc, reversing
changes made to 3085fbc138.
2015-10-02 08:39:57 -07:00
Cory LaNou f50813460e protobuf update.. :-( 2015-10-01 15:39:15 -05:00
Jason Wilder 06c143c2dd Handle missing values in aggregate derivative queries better
If an aggregate derivative query did not have a value in the first
time bucket, it would abort early and return a single row with value
of 0.  Similarly, if either the current or previous value was nil,
it would skip the row and not append any values causing gaps and
no data to show up.

Instead, this will append a nil value if either the current or previous
valis is nil.  This essentially allows nil values to carry through the
results as well as gives a more sensible value for rows where we cannot
compute a difference (instead of dropping the row as before).

Fixes #4237 #4263
2015-10-01 13:05:19 -06:00
Ben Johnson 343dd23ee7 refactor map functions to use list of values
This commit changes `tsdb.mapFunc` to use `tsdb.MapInput` instead
of an iterator. This will make it easier and faster to pass blocks
of values from the new storage engine into the engine.
2015-09-29 14:00:33 -06:00