Commit Graph

656 Commits (3769ad3d216f9f5eeb38d414f661d80d763e197c)

Author SHA1 Message Date
Jake Goulding c85f4b45ed refactor: use raw strings instead of escape sequences 2020-02-21 09:55:18 -05:00
Jake Goulding 07b65e32d5 refactor: Remove unneeded String allocation 2020-02-21 09:33:11 -05:00
Jake Goulding 7620adfc70 refactor: Prefer mutable slices over a mutable Vec
We never add anything to the collection, so don't allow for the
possibility.
2020-02-21 09:02:19 -05:00
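A minimal sketch of the idea in the commit above, not the project's actual code: the function name and the `(timestamp, value)` tuple layout are assumptions made for illustration. Accepting `&mut [T]` lets the function reorder or edit elements while making it impossible to push or remove any.

```rust
/// Sorts points in place by timestamp.
///
/// Taking `&mut [(i64, f64)]` rather than `&mut Vec<(i64, f64)>` documents
/// that the collection is never grown or shrunk here; only existing
/// elements are reordered. (Hypothetical helper, not from the repository.)
fn sort_by_time(points: &mut [(i64, f64)]) {
    points.sort_by_key(|&(time, _)| time);
}
```

Callers holding a `Vec` can still pass `&mut my_vec`, and callers holding an array or a sub-slice can use the same function.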
Carol (Nichols || Goulding) 998537da0f refactor: Use the newer style of module file naming scheme
This cuts down on the number of mod.rs files open in your text editor at
the same time.
2020-02-19 08:43:06 -05:00
Carol (Nichols || Goulding) cb62590582 style: Warn on unnecessary use of iter and fix all cases
This is more idiomatic than calling `iter` explicitly.
2020-02-19 08:37:39 -05:00
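As a small illustration of the style change described above (the function is hypothetical, not taken from the repository), a `for` loop can borrow the collection directly instead of calling `iter` on it:

```rust
/// Sums the byte lengths of all names.
fn sum_lengths(names: &[String]) -> usize {
    let mut total = 0;
    // Equivalent to `for name in names.iter()`, but without the explicit
    // call: `&[String]` implements `IntoIterator` and yields `&String`.
    for name in names {
        total += name.len();
    }
    total
}
```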
Carol (Nichols || Goulding) c7a7dde51f fix: Remove redundant clones in tests
As suggested by the redundant_clone clippy lint.

This is a little weird because different constructions of `points`
now need clones in different places, but rustc will tell us when
we need to clone and clippy will tell us when we don't.
2020-02-19 08:37:39 -05:00
Carol (Nichols || Goulding) 196b364a00 fix: Assert on an expected error rather than failing if we get Ok
The clippy lints assertions_on_constants and single_match were pointing
at this spot, so it's as good a time as any to take care of the TODO!
2020-02-19 08:37:39 -05:00
Carol (Nichols || Goulding) cdca82fee5 fix: Remove an unnecessary conversion
As found by the identity_conversion clippy lint. Clippy was warning
about this on master, but warnings didn't fail the build until a commit
I'm adding in this PR.
2020-02-19 08:37:39 -05:00
Carol (Nichols || Goulding) 41ddab1a54 fix: Change an arbitrary test input so that clippy doesn't think it looks like pi
I don't think we actually want to use a precise value for pi, but if we
put 3.14, the approx_constant clippy lint will suggest it.
2020-02-19 08:37:11 -05:00
Carol (Nichols || Goulding) eabf96a0d7 fix: Test that floats are approximately equal within machine epsilon
As found by the float_cmp clippy lint.

There are crates that provide macros for this, but it's small enough
that I think a test helper function in-tree is fine.
2020-02-19 08:37:11 -05:00
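A sketch of what such an in-tree test helper might look like. The name `approximately_equal` and the choice of an absolute tolerance are assumptions; the commit does not show the helper itself.

```rust
/// Returns true when `a` and `b` differ by less than machine epsilon.
///
/// This is an absolute comparison, so it is only meaningful for values
/// of roughly unit magnitude; large values would need a relative check.
/// (Hypothetical helper, not from the repository.)
fn approximately_equal(a: f64, b: f64) -> bool {
    (a - b).abs() < f64::EPSILON
}
```

In a test, `assert!(approximately_equal(0.1 + 0.2, 0.3))` passes even though `0.1 + 0.2 == 0.3` is false for `f64`.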
Carol (Nichols || Goulding) c6eda3a8b5 fix: Allow excessive precision for now
The way these tests are written, any truncation will be made on both the
values used in the test and the values expected in the test, so this is
probably fine.

But we might not be testing with the exact values we think we're testing
with, so we should audit the values clippy warns about at some point.
2020-02-19 08:37:11 -05:00
Carol (Nichols || Goulding) 7cc871b305 style: Parenthesize potentially confusing expression
As suggested by the clippy precedence lint
2020-02-19 08:37:11 -05:00
Carol (Nichols || Goulding) dd7d6e838f style: Allow unreadable literals in tests 2020-02-19 08:37:11 -05:00
Jake Goulding 3effd368d3 feature: Start gRPC server on a different port from the hyper server 2020-02-17 16:37:43 -05:00
Carol (Nichols || Goulding) 9dc3699466 style: cargo fmt 2020-02-17 10:53:57 -05:00
Jake Goulding 155bfcbd4f build: Update prost to latest version 2020-02-17 10:48:33 -05:00
Carol (Nichols || Goulding) 23a9a800a6 Merge remote-tracking branch 'origin/master' into hyper 2020-02-17 08:54:19 -05:00
Carol (Nichols || Goulding) b6184cb778 Merge remote-tracking branch 'origin/master' into er-encoder-bench 2020-02-17 08:47:38 -05:00
Carol (Nichols || Goulding) bbbbf8ee07 fix: Remove unnecessary allocation 2020-02-17 08:10:53 -05:00
Edd Robinson 1ad21b3e90 refactor: apply clippy 2020-02-14 18:19:51 +00:00
Carol (Nichols || Goulding) 6fc6fc3329 Merge remote-tracking branch 'origin/master' into er-encoder-bench 2020-02-14 12:47:33 -05:00
Edd Robinson 92baa3d7e8 refactor: apply clippy 2020-02-14 17:13:20 +00:00
Edd Robinson b2cdd299f5 refactor: apply clippy 2020-02-14 17:13:05 +00:00
Carol (Nichols || Goulding) 4dfd4d90ba fix: Use BytesMut directly rather than through actix 2020-02-14 10:56:37 -05:00
Carol (Nichols || Goulding) f7b33d47de fix: Adjust parameter type to avoid double allocation 2020-02-14 10:19:39 -05:00
Carol (Nichols || Goulding) 12fbb23112 fix: Make both query parsing places return bad request on failure 2020-02-14 10:17:48 -05:00
Carol (Nichols || Goulding) dc7a2ec333 fix: Improve parameter type 2020-02-14 10:02:35 -05:00
Carol (Nichols || Goulding) a16c49537f fix: Include limit in size exceeded error 2020-02-14 10:00:35 -05:00
Carol (Nichols || Goulding) 8b1255be9d refactor: Switch to a hyper server 2020-02-14 09:59:09 -05:00
Carol (Nichols || Goulding) 062bbc5a34 Merge remote-tracking branch 'origin/master' into er-encoder-bench 2020-02-14 09:15:24 -05:00
Jake Goulding 615e0f6537 style: Apply rustfmt defaults to the entire project 2020-02-14 08:02:11 -05:00
Carol (Nichols || Goulding) 77125bd8e5 improvement: Remove TODO comments that are now done 2020-02-13 10:47:01 -05:00
Carol (Nichols || Goulding) 3577916307 Merge remote-tracking branch 'origin/master' into er-encoder-bench 2020-02-12 13:25:33 -05:00
Carol (Nichols || Goulding) 5942dd5c8a fix: Remove turbofish that are no longer needed 2020-02-12 09:46:29 -05:00
Carol (Nichols || Goulding) 64223b70a9 refactor: Collapse the read_*_range functions 2020-02-12 09:43:42 -05:00
Carol (Nichols || Goulding) 3399cea18a refactor: Extract a trait to make read_*_range fns more similar 2020-02-12 09:43:42 -05:00
Carol (Nichols || Goulding) 16c8834fbc refactor: Collapse read_*_range functions into a generic function 2020-02-12 09:43:40 -05:00
Carol (Nichols || Goulding) 2b642ffaac refactor: Make read_*_bytes more similar by extracting a trait 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) af85249ea6 fix: Remove unneeded lifetime annotations 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) 07bb075e93 refactor: Extract storing different types in SeriesData 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) 0b515fe1f9 fix: Switch from Copy to Clone bounds 2020-02-12 09:42:46 -05:00
Carol (Nichols || Goulding) daa02069db refactor: Remove unused function 2020-02-12 09:42:42 -05:00
Carol (Nichols || Goulding) 867523c2d9 refactor: Extract the code for storing types' bytes in RocksDB 2020-02-12 09:36:53 -05:00
Jake Goulding 657059af9f fix: Do not transmute unknown bytes to enums
Fixes #24
2020-02-11 20:47:29 -05:00
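One common alternative to transmuting a stored byte into an enum is an explicit, checked conversion. The sketch below assumes a hypothetical `SeriesDataType` enum with made-up discriminants; it is not the fix that was actually committed.

```rust
use std::convert::TryFrom;

/// Hypothetical on-disk type tag; the variants and values are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum SeriesDataType {
    Int64 = 0,
    Float64 = 1,
}

impl TryFrom<u8> for SeriesDataType {
    /// The unrecognized byte is handed back to the caller.
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        // Every accepted value is listed explicitly, so a corrupt or
        // unknown byte becomes an error instead of an invalid enum value.
        match byte {
            0 => Ok(SeriesDataType::Int64),
            1 => Ok(SeriesDataType::Float64),
            other => Err(other),
        }
    }
}
```

Transmuting a byte that matches no variant is undefined behavior, which is why the checked `match` is preferable for data read from disk.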
Jake Goulding 461ead862b
Merge pull request #25 from influxdata/reduce-vec-creation
perf: Reduce amount of Vecs created in the RocksDB code
2020-02-11 20:45:59 -05:00
Jake Goulding 26a6d1a272
Merge pull request #26 from influxdata/ok-or-else
refactor: Use Option::ok_or_else in RocksDB adapter code
2020-02-11 20:45:50 -05:00
Jake Goulding d248c3e7f2 refactor: Use Option::ok_or_else in RocksDB adapter code
This helper reduces the boilerplate of creating errors for a missing
value.
2020-02-11 20:41:31 -05:00
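To illustrate the helper mentioned above, here is a small sketch using a plain `HashMap` in place of the RocksDB adapter. The function name and the `String` error type are assumptions.

```rust
use std::collections::HashMap;

/// Looks up a series id, turning a missing entry into an error.
/// (Hypothetical function, not from the repository.)
fn lookup_series_id(index: &HashMap<String, u64>, key: &str) -> Result<u64, String> {
    index
        .get(key)
        .copied()
        // The closure runs only when the value is missing, so the error
        // message is not allocated on the success path.
        .ok_or_else(|| format!("series key not found: {}", key))
}
```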
Jake Goulding b0b8925379 perf: Avoid creating a vector for a subslice 2020-02-11 20:40:21 -05:00
Jake Goulding 2f63ca7fdb perf: Remove unneeded Vec clone 2020-02-11 20:40:16 -05:00
Jake Goulding 959f98f605 perf: Reduce unneeded Vec creation
- Integers can be directly converted to arrays of bytes
- We can extend vectors from other slices instead of `Vec`s
2020-02-11 20:40:12 -05:00
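The two points above can be sketched as follows. The key layout (prefix, then bucket id, then series id, all big-endian) is an assumption chosen for illustration and may not match the real RocksDB keys.

```rust
/// Builds a storage key without any intermediate `Vec`s.
/// (Hypothetical function and key layout.)
fn series_point_key(prefix: &[u8], bucket_id: u32, series_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(prefix.len() + 4 + 8);
    // Slices can be appended directly; no temporary `Vec` is needed.
    key.extend_from_slice(prefix);
    // `to_be_bytes` returns a fixed-size array (`[u8; 4]` / `[u8; 8]`)
    // that lives on the stack.
    key.extend_from_slice(&bucket_id.to_be_bytes());
    key.extend_from_slice(&series_id.to_be_bytes());
    key
}
```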
Jake Goulding be3ed216c3 perf: Avoid taking `Vec` by reference
There's no benefit to accepting a reference to a `Vec` over a slice.

Further details available in https://stackoverflow.com/q/40006219/155423
2020-02-11 20:40:06 -05:00
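A short sketch of the signature change described above, using a hypothetical function. A `&[T]` parameter accepts everything a `&Vec<T>` parameter would, plus arrays and sub-slices.

```rust
/// Returns the largest value, or `None` for an empty slice.
/// Written against `&[i64]` rather than `&Vec<i64>`.
/// (Hypothetical function, not from the repository.)
fn max_value(values: &[i64]) -> Option<i64> {
    values.iter().copied().max()
}
```

A caller with a `Vec<i64>` named `v` can pass `&v` unchanged, and `&v[1..]` or a fixed-size array reference work as well.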
Jake Goulding b44f7d8869 perf: Avoid calculating the hashcode twice in the RocksDB adapter
Unfortunately, we can't use `Entry::or_insert_with` because we need to
use the key to construct the value.
2020-02-11 20:39:23 -05:00
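The pattern can be sketched with the standard `HashMap` entry API: matching on the `Entry` hashes the key once and still gives access to the key when building the value. The cache below, which counts tags in a series key, is a made-up example rather than the adapter's real code.

```rust
use std::collections::hash_map::{Entry, HashMap};

/// Returns the number of tags in a series key such as
/// `cpu,host=a,region=b`, caching the result per key.
/// (Hypothetical function, not from the repository.)
fn tag_count(cache: &mut HashMap<String, usize>, series_key: &str) -> usize {
    // A `contains_key` check followed by `insert` would hash the key
    // twice; `entry` performs a single lookup.
    match cache.entry(series_key.to_string()) {
        Entry::Occupied(entry) => *entry.get(),
        Entry::Vacant(entry) => {
            // The value is derived from the key itself, which the closure
            // passed to `or_insert_with` would not receive.
            let count = entry.key().split(',').count() - 1;
            *entry.insert(count)
        }
    }
}
```

Newer versions of the standard library also provide `Entry::or_insert_with_key`, which covers this case directly.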
Jake Goulding b5879c8414 style: Enforce Rust 2018 idioms
For reference, the [2018 edition guide][guide] talks about some of the
big differences.

The two that are applied here are:

1. `extern crate` is basically not needed at all anymore; you can do
`use cratename` instead. This makes importing things more uniform
between your own crate and other crates.

1. Rust does a reasonable amount of [*lifetime elision*][elision] so
    we don't have to type `'a` in as many places. However, one that
    ended up tripping up people is when a generic lifetime was part of
    a type. The compiler cared about this lifetime, but since it
    wasn't visible, people would forget it's there, then try to use it
    as if it wasn't constrained by the lifetime.

    A good example is the `Chars` iterator. It references the original
    `&str` and cannot live longer than the string. With the original
    way this was being passed (`&mut Chars`) it was visually evident
    that there was *some* lifetime, thanks to seeing the `&`, but it
    wasn't obvious that there's *another* lifetime — the string.

    With the addition of the *anonymous lifetime* (`'_`), it's now
    encouraged to use that when a type has a lifetime parameter that
    isn't relevant to prevent confusing mistakes that lead to compiler
    errors.

There are probably a few more things enabled by the lint as well. I
forget the exact reason that these are not yet enabled by default,
though.

[guide]: https://doc.rust-lang.org/edition-guide/rust-2018/index.html
[elision]: https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#lifetime-elision
2020-02-11 08:08:12 -05:00
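The `Chars` case from the message above can be sketched like this. The function `read_word` is hypothetical; the point is only the `Chars<'_>` spelling, which makes the hidden borrow of the source string visible in the signature.

```rust
use std::str::Chars;

/// Consumes characters up to and including the next whitespace character
/// and returns everything before it.
///
/// `Chars<'_>` signals that the iterator borrows from some string, even
/// though this function does not need to name that lifetime.
fn read_word(chars: &mut Chars<'_>) -> String {
    chars.by_ref().take_while(|c| !c.is_whitespace()).collect()
}
```

Writing the parameter as `&mut Chars` still compiles, but the `elided_lifetimes_in_paths` lint, part of the `rust_2018_idioms` group, flags it.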
Edd Robinson 7a9f69c921 refactor: address PR feedback 2020-02-07 22:28:11 +00:00
Edd Robinson 5327efd926 test: add timestamp benchmarks 2020-02-07 13:26:59 +00:00
Edd Robinson 4185307d78 test: add float encoder/decoder benchmarks
This commit adds benchmarks for the float encoder and decoder. The
following scenarios are benchmarked:

- sequential values;
- random values;
- real CPU values (from Telegraf output).

Each scenario is benchmarked with a variety of block sizes.
2020-01-21 15:01:35 +00:00
Paul Dix f1329a8a08 Remove unused code 2020-01-13 10:34:41 -05:00
Paul Dix bec6b3cf9c Add MemDB and test framework
This commit adds a test framework for the InvertedIndex and SeriesStore traits. Structs that implement these traits can call into their tests to ensure they work.

This commit also adds an in memory database that keeps time series data in a ring buffer per series and an inverted index in memory using a combination of HashMap and BTreeMap.
2020-01-13 10:31:47 -05:00
Paul Dix 80493ba517 Refactor database for arbitrary backends
This commit pulls the database behavior into three traits: ConfigStore, InvertedIndex, and SeriesStore. The ConfigStore holds the Bucket definitions, the InvertedIndex holds the index of measurement names, tags, and fields, and the SeriesStore holds the raw time series data indexed only by bucket id, series id, and time.

This is the initial work to enable memory and S3 backed databases. It is also the preliminary work to break data out into shards.
2020-01-13 10:31:47 -05:00
Edd Robinson 62a0f26066 refactor: src doesn't need to be mutable 2020-01-13 14:59:51 +00:00
Edd Robinson c1492994e0 refactor: tidy up 2020-01-12 15:37:30 +00:00
Edd Robinson 2a1ddf5669 feat: add float encoder 2020-01-12 15:37:26 +00:00
Edd Robinson 80ff911259 feat: automatically create db dir and test dir 2020-01-06 15:55:22 +00:00
Edd Robinson b06d1005a8 refactor: rustfmt 2020-01-06 13:49:39 +00:00
Paul Dix af9e1cae32 Fix race condition on new series in RocksDB
There was a race condition when inserting new series into RocksDB: two separate threads inserting the same series at the same time could cause an error. The thread that inserted first was fine, but the second would see that the series had already been inserted without reading in its ID, which it needs later to write the points.
2020-01-05 19:34:05 -05:00
Paul Dix 04bd6b8a12 Update line protocol parser
The earlier version of this line protocol parser incorrectly used a space as a delimiter between fields. This updates it to use a comma as it is in InfluxDB 1.x and 2.x.
2020-01-05 19:32:35 -05:00
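As a rough illustration of the delimiter fix, the sketch below splits a field set such as `usage=0.5,idle=99.5` on commas. It is a deliberately simplified, hypothetical function: it handles only float values and ignores escaping and quoted strings, which real line protocol allows.

```rust
/// Parses a comma-separated field set into `(name, value)` pairs.
/// Pairs that are malformed or not valid floats are skipped.
/// (Simplified sketch, not the repository's parser.)
fn parse_fields(field_set: &str) -> Vec<(String, f64)> {
    field_set
        // Fields are separated by commas, not spaces.
        .split(',')
        .filter_map(|pair| {
            let mut parts = pair.splitn(2, '=');
            let key = parts.next()?;
            let value = parts.next()?.parse::<f64>().ok()?;
            Some((key.to_string(), value))
        })
        .collect()
}
```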
Paul Dix ce69c5a9ce Update read API to return float series
Updates to read API in main.rs to return values for float series. I'm not terribly happy with the way I had to do this, but I was struggling a bit with the type system gymnastics. I assume I'll have to revisit this anyway when I add support for other storage backends.
2020-01-05 19:31:42 -05:00
Paul Dix b784563792 Add support for float64 to rocksdb
Adds support for f64 time series in RocksDB. Series data types are now stored in the index under the id to key mapping, which is now id to type and key.

This doesn't enforce the same data type for values being written into a series, which will come later. Also later will be adding support for float64 series in the read API.
2020-01-05 18:38:45 -05:00
Paul Dix 8754911b5a Add support for float64 to line protocol parser
Adds support for f64 to the line protocol parser. Also updates the return value of parse to return a Vec of mixed type points that can be later written into the database.

The PointType struct is only for use in this context. In the context of querying or working with time series for compaction, we'll want vectors of actual typed points of the same kind so we don't have to do inefficient enum matches.
2020-01-05 16:28:12 -05:00
Paul Dix c7a862dca0 Fix compile warnings from the Rust linter
This commit fixes all the linter warnings. However, there are a number of spots, particularly in the encoders where I added `#[allow(dead_code)]` to get past them. We should go through and fix those up at some point, I'll log an issue to track.
2020-01-05 13:44:03 -05:00
Paul Dix 1a851f8d0b Add basic read endpoint
This commit adds a basic read endpoint to pull data out of the database. In order to provide the basic functionality a few things were added:

* Time package with limited support for parsing Flux style durations
* API endpoint at /api/v2/read with query parameters of org_id, bucket_name, predicate, start, and stop

The start and stop query parameters only support relative durations.

The predicate parameter supports what is possible in the parse_predicate method and in the RocksDB implementation (only == comparisons on tags and AND or OR)
2020-01-04 19:07:54 -05:00
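For readers unfamiliar with relative durations, here is a minimal sketch of parsing values like `-1h` or `10m` into seconds. The function name, the set of units, and the return type are all assumptions; the commit does not say which parts of the Flux duration syntax are supported.

```rust
/// Parses a relative duration such as `-1h` into a signed number of
/// seconds. Returns `None` for anything it does not understand.
/// (Hypothetical sketch; supports only single-unit s, m, h and d.)
fn parse_relative_duration(input: &str) -> Option<i64> {
    let (sign, rest) = match input.strip_prefix('-') {
        Some(rest) => (-1, rest),
        None => (1, input),
    };
    // Split the leading digits from the trailing unit.
    let unit_start = rest.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = rest.split_at(unit_start);
    let magnitude: i64 = digits.parse().ok()?;
    let seconds_per_unit = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        _ => return None,
    };
    Some(sign * magnitude * seconds_per_unit)
}
```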
Paul Dix fe9cb87c3d Update rocks db series filter
Add the series key to the SeriesFilter struct that is returned when searching the index.
2020-01-03 17:35:18 -05:00
Paul Dix 4265e7b11b Update write API endpoint
Updates the actix-web and actix-rt versions to 2.0 and 1.0 respectively. Wires up the write endpoint to create buckets if they don't exist.
2020-01-03 17:35:18 -05:00
Paul Dix 4892a87898 Implement read range on the database
This commit adds iterators for iterating over series and batches of points for a read range request. The exact signature/structure of the iterators are likely to change when this is generalized for other data types and other storage backends (S3 & memory).
2020-01-03 17:35:18 -05:00
Paul Dix c76ce39da5 Update rocksdb.rs write_points
This commit updates the write_points method to use the bucket id and series id in the key for a stored point value.

It also updates the Database methods to be immutable borrows moving any mutable concerns into interior structures so it can be easily called from many threads.
2020-01-03 17:35:18 -05:00
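The second paragraph of the commit above describes interior mutability. A minimal sketch, assuming a hypothetical `Database` struct that guards its state with a `Mutex` (the real code may use different types and locks):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical database handle whose methods take `&self`, so it can be
/// shared across threads (for example inside an `Arc`).
struct Database {
    // All mutation happens behind the lock.
    series_ids: Mutex<HashMap<String, u64>>,
}

impl Database {
    fn new() -> Self {
        Database { series_ids: Mutex::new(HashMap::new()) }
    }

    /// Returns the id for `key`, assigning the next free id if needed.
    fn get_or_create_series_id(&self, key: &str) -> u64 {
        let mut ids = self.series_ids.lock().expect("series map lock poisoned");
        let next_id = ids.len() as u64 + 1;
        let id = *ids.entry(key.to_string()).or_insert(next_id);
        id
    }
}
```

Because the lookup and the insert happen under one lock, two threads asking for the same new series both get the same id.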
Paul Dix 3c8f93a9a7 Implement OR in predicate
Adds support for logical OR operator in predicates against the index.
2019-12-24 15:25:43 -05:00
Paul Dix f3807456c9 Implement AND in predicate
Adds support for logical AND operator in predicates against the index.
2019-12-24 15:21:04 -05:00
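Conceptually, AND and OR over an index reduce to intersecting and unioning sets of series ids. The sketch below uses `BTreeSet<u64>` as a stand-in for the bitmap postings lists; the `Predicate` type and `evaluate` function are assumptions, not the project's API.

```rust
use std::collections::BTreeSet;

/// Hypothetical predicate tree. `Matches` holds the series ids already
/// found for a single `key = "value"` comparison.
enum Predicate {
    Matches(BTreeSet<u64>),
    And(Box<Predicate>, Box<Predicate>),
    Or(Box<Predicate>, Box<Predicate>),
}

/// Evaluates a predicate to the set of matching series ids.
fn evaluate(predicate: &Predicate) -> BTreeSet<u64> {
    match predicate {
        Predicate::Matches(ids) => ids.clone(),
        Predicate::And(left, right) => {
            let (left, right) = (evaluate(left), evaluate(right));
            // AND keeps only ids present on both sides.
            left.intersection(&right).copied().collect()
        }
        Predicate::Or(left, right) => {
            let (left, right) = (evaluate(left), evaluate(right));
            // OR keeps ids present on either side.
            left.union(&right).copied().collect()
        }
    }
}
```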
Paul Dix 6cd4c5b583 Add basic tag key/value index
This commit brings in a Roaring Bitmap implementation to keep postings lists of tag key/value pairs to the set of series ids that have those pairs. The croaring implementation was used because the Treemap was required for u64 support for series ids and it was serializable (unlike the other pure Rust roaring implementation).

This doesn't shard the postings lists based on size. It also doesn't implement the time/index levels.

The predicate matching currently only works for a simple key = "value" match.
2019-12-24 13:44:30 -05:00
Paul Dix f77b0a3842 Move storage module into directory
Moved the storage module into its own directory. Split into the rocksdb portion of the code and the predicate parsing.
2019-12-23 11:49:58 -05:00
Paul Dix 71fe0aa71c Update storage with predicate parsing
Adds a basic predicate parser to make testing the index easier
2019-12-23 11:36:12 -05:00
Paul Dix 54ef130cea Wire up get tag values in storage
This adds basic support for getting tag values in storage. Still needs to add predicate and time range support.
2019-12-20 13:46:41 -05:00
Paul Dix 7effec0f48 Add shell of index functions and get tag keys
Added the shell of index functions to return series IDs that match a predicate, tag keys with a predicate, and tag values with a predicate.
2019-12-20 13:07:03 -05:00
Paul Dix 7f2f4eaceb Update mod.rs
Add parser for key/value pairs to be indexed. Measurement and field are represented as _m and _f respectively.
2019-12-20 13:05:59 -05:00
Paul Dix 5d80d5e100 feat(storage): Add series to ID index
This commit is the beginning of the RocksDB based index for series and their tag metadata.

I started to stub out different index levels but stopped short of implementing them.

There are a number of spots where I'm unwrapping return values that we may want to revisit later. For now I want to have the program panic if those things pop up.
2019-12-19 15:58:00 -05:00
Paul Dix 1a10243b46 Update integer.rs
Fix build error in test
2019-12-19 15:50:22 -05:00
Paul Dix 617c2960a8 feat(storage): Implement bucket definitions and persistence
This updates the build system to use Prost to build the protobuf objects.

It adds tests for creating, storing and loading bucket definitions.

The tests use an actual on-disk RocksDB implementation to ensure that it's tested all the way to persistence.
2019-12-17 17:01:41 -05:00
Paul Dix 100d192538 Update main.rs
Fix build warnings for unused imports.
2019-12-17 17:01:41 -05:00
Paul Dix 3856af8842 Update timestamp.rs
Update test to fix build error.
2019-12-17 17:01:41 -05:00
Edd Robinson 64efa35dea refactor: tidy up comments 2019-12-17 12:10:27 +00:00
Edd Robinson cf64234c2d feat: integer encoder 2019-12-16 17:06:04 +00:00
Edd Robinson 36d3138a3d refactor: timestamp encoder 2019-12-13 16:43:29 +00:00
Edd Robinson d9b966579e refactor: move timestamp-specific RLE 2019-12-13 11:12:56 +00:00
Paul Dix 9cadb1bb52 Add server skeleton with Actix and RocksDB 2019-12-12 10:15:16 -05:00
Edd Robinson 0627ea0e5b refactor: organise encoders 2019-12-12 13:47:14 +00:00
Edd Robinson 8662982233 feat(encoders): add timestamp block encoder 2019-12-11 19:21:56 +00:00
Edd Robinson bff8704a0b test: more simple8b tests 2019-12-11 19:21:38 +00:00
Edd Robinson 3bbc86a8a0 fix: fix RLE bug 2019-12-11 19:21:15 +00:00
Edd Robinson e9db04292c refactor: change simple8b API to use binary encoding 2019-12-10 20:05:41 +00:00
Edd Robinson fb83e9c7fa feat(encoding): add RLE encoder/decoder 2019-12-09 18:04:01 +00:00
Edd Robinson 4009e67bf3 refactor: change encoder/decoder API 2019-12-08 21:00:45 +00:00
Edd Robinson 55d711599e tests: add tests for simple8b 2019-12-04 13:14:37 +00:00
Edd Robinson 824044eac1 refactor: move simple8b encoder/decode 2019-12-04 13:14:20 +00:00
Edd Robinson 46bbe4d317 feat(encoders): add simple8b encoder/decoder 2019-12-03 13:46:36 +00:00
Edd Robinson df355da693 refactor: comment out imports 2019-12-03 13:46:09 +00:00
Paul Dix 7a122b23cf Add shell for line protocol parser and encoders 2019-11-22 17:06:34 -05:00
Paul Dix b9b5a815b7 Initial commit with some notes and proto 2019-11-22 16:59:04 -05:00