Commit Graph

10360 Commits (2bb6db3f376eb10af0f3e3e5d5d2d12f2568a148)

Author SHA1 Message Date
Carol (Nichols || Goulding) 72ea8c09c9 fix: Move a vec allocation outside of the benchmarked code
This is consistent with the rest of the decode benchmarks and I think
matches the benchmark intentions best.
2020-02-12 12:45:43 -05:00
Carol (Nichols || Goulding) c498d1f524 fix: Remove truncate from encoding benchmark
The first thing the `encode` function does is truncate the `dst` buffer,
so this should never be necessary inside the code being benchmarked for
testing encoders.
2020-02-12 12:42:54 -05:00
Carol (Nichols || Goulding) 1fc46c33f3 refactor: Call the general encoding benchmarking fn for CPU values 2020-02-12 11:46:49 -05:00
Carol (Nichols || Goulding) b36c4b9672 refactor: Extract shared benchmarking of encoding
Benchmarking random values was more general than sequential since it
takes an arbitrary function to create the decoded values; express
sequential in terms of random and change the name of random to be
general benchmarking of encoding.
2020-02-12 11:44:08 -05:00
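The shape described in this commit can be sketched roughly as below. This is a minimal illustration and not the repository's benchmark code: the names `time_encoding` and `time_sequential_encoding`, the `i64` value type, and the use of `std::time::Instant` in place of a benchmarking harness are all assumptions. The general helper takes an arbitrary function that builds the decoded values, and the sequential case is expressed in terms of it. Setup and allocation stay outside the timed region, in line with the later benchmark fixes.

```rust
use std::time::{Duration, Instant};

// General helper (hypothetical name): `make_values` builds the decoded
// input for a given batch size, and `encode` is the encoder under test.
fn time_encoding<G, E>(batch_size: usize, make_values: G, mut encode: E) -> Duration
where
    G: Fn(usize) -> Vec<i64>,
    E: FnMut(&[i64], &mut Vec<u8>),
{
    // Input generation and buffer allocation happen before timing starts.
    let values = make_values(batch_size);
    let mut dst = Vec::new();

    let start = Instant::now();
    encode(&values, &mut dst);
    start.elapsed()
}

// The sequential case is the general helper with a specific generator.
fn time_sequential_encoding<E>(batch_size: usize, encode: E) -> Duration
where
    E: FnMut(&[i64], &mut Vec<u8>),
{
    time_encoding(batch_size, |n| (0..n as i64).collect(), encode)
}
```

A random-value variant would pass a different generator closure to the same helper.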
Carol (Nichols || Goulding) 2080bfc5c4 refactor: Extract a fn for benchmarking encoding of random values 2020-02-12 11:41:25 -05:00
Carol (Nichols || Goulding) 532329f83e refactor: Extract a fn for benchmarking encoding of sequential values 2020-02-12 11:30:21 -05:00
Carol (Nichols || Goulding) 85b5d339a9 refactor: Extract batch sizes into constants
Exposes which tests use which batch sizes more clearly; names of
constants could be improved.
2020-02-12 11:14:57 -05:00
Carol (Nichols || Goulding) e361cded92 refactor: Move all encoder benchmarks to one file 2020-02-12 11:08:07 -05:00
Carol (Nichols || Goulding) 6fbe9167ae refactor: Extract large constant to a separate module 2020-02-12 10:28:00 -05:00
Carol (Nichols || Goulding) 28d03c4047
Merge pull request #31 from influxdata/cn-small-piece
Refactoring for generics
2020-02-12 09:49:50 -05:00
Carol (Nichols || Goulding) 5942dd5c8a fix: Remove turbofish that are no longer needed 2020-02-12 09:46:29 -05:00
Carol (Nichols || Goulding) 64223b70a9 refactor: Collapse the read_*_range functions 2020-02-12 09:43:42 -05:00
Carol (Nichols || Goulding) 3399cea18a refactor: Extract a trait to make read_*_range fns more similar 2020-02-12 09:43:42 -05:00
Carol (Nichols || Goulding) 16c8834fbc refactor: Collapse read_*_range functions into a generic function 2020-02-12 09:43:40 -05:00
Carol (Nichols || Goulding) 2b642ffaac refactor: Make read_*_bytes more similar by extracting a trait 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) af85249ea6 fix: Remove unneeded lifetime annotations 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) 07bb075e93 refactor: Extract storing different types in SeriesData 2020-02-12 09:42:47 -05:00
Carol (Nichols || Goulding) 0b515fe1f9 fix: Switch from Copy to Clone bounds 2020-02-12 09:42:46 -05:00
Carol (Nichols || Goulding) daa02069db refactor: Remove unused function 2020-02-12 09:42:42 -05:00
Carol (Nichols || Goulding) 867523c2d9 refactor: Extract the code for storing types' bytes in RocksDB 2020-02-12 09:36:53 -05:00
Jake Goulding 5774414a23
Merge pull request #29 from influxdata/enum-int-mapping
fix: Do not transmute unknown bytes to enums
2020-02-11 20:54:08 -05:00
Jake Goulding 657059af9f fix: Do not transmute unknown bytes to enums
Fixes #24
2020-02-11 20:47:29 -05:00
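One common alternative to transmuting a byte into an enum is an explicit `TryFrom<u8>` implementation that rejects unknown values. The sketch below assumes a hypothetical `SeriesDataType` enum with made-up discriminants; it is not taken from the repository and may differ from what the commit actually does.

```rust
use std::convert::TryFrom;

/// Hypothetical enum for illustration; the variants and discriminants
/// are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum SeriesDataType {
    Int64 = 0,
    Float64 = 1,
}

impl TryFrom<u8> for SeriesDataType {
    // The rejected byte is handed back as the error value.
    type Error = u8;

    // Map each known discriminant explicitly; any other byte becomes an
    // error instead of being reinterpreted as an enum value.
    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(SeriesDataType::Int64),
            1 => Ok(SeriesDataType::Float64),
            other => Err(other),
        }
    }
}
```

With this in place, `SeriesDataType::try_from(byte)` returns an `Err` for a byte that matches no variant, where a transmute would produce an invalid enum value.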
Jake Goulding 461ead862b
Merge pull request #25 from influxdata/reduce-vec-creation
perf: Reduce amount of Vecs created in the RocksDB code
2020-02-11 20:45:59 -05:00
Jake Goulding 26a6d1a272
Merge pull request #26 from influxdata/ok-or-else
refactor: Use Option::ok_or_else in RocksDB adapter code
2020-02-11 20:45:50 -05:00
Jake Goulding bfef773109
Merge pull request #27 from influxdata/double-hash
perf: Avoid calculating the hashcode twice in the RocksDB adapter
2020-02-11 20:45:37 -05:00
Jake Goulding d248c3e7f2 refactor: Use Option::ok_or_else in RocksDB adapter code
This helper reduces the boilerplate of creating errors for a missing
value.
2020-02-11 20:41:31 -05:00
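As a rough illustration of the helper named in this commit, the sketch below turns a missing map entry into an error with `Option::ok_or_else`. The `MissingKey` type, the `bucket_id` function, and the map layout are hypothetical and not taken from the RocksDB adapter.

```rust
use std::collections::HashMap;

/// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
struct MissingKey(String);

// Look up a bucket id by name (the names and types are assumptions).
fn bucket_id(buckets: &HashMap<String, u32>, name: &str) -> Result<u32, MissingKey> {
    buckets
        .get(name)
        .copied()
        // The closure runs only when the value is missing, so the error
        // and its String allocation are built lazily.
        .ok_or_else(|| MissingKey(name.to_string()))
}
```

Compared with a hand-written `match` on the `Option`, the missing-value branch shrinks to a single closure.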
Jake Goulding b0b8925379 perf: Avoid creating a vector for a subslice 2020-02-11 20:40:21 -05:00
Jake Goulding 2f63ca7fdb perf: Remove unneeded Vec clone 2020-02-11 20:40:16 -05:00
Jake Goulding 959f98f605 perf: Reduce unneeded Vec creation
- Integers can be directly converted to arrays of bytes
- We can extend vectors from other slices instead of `Vec`s
2020-02-11 20:40:12 -05:00
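The two points in this commit can be sketched as follows. The `series_key` function and its key layout (a bucket id followed by a series id, both big-endian) are assumptions made for the example, not the adapter's actual format.

```rust
// Build a key from two ids without creating intermediate `Vec`s.
// The layout used here is hypothetical.
fn series_key(bucket_id: u32, series_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(12);
    // `to_be_bytes` returns a fixed-size array, so no heap allocation is
    // needed for the integer's bytes.
    key.extend_from_slice(&bucket_id.to_be_bytes());
    // `extend_from_slice` accepts any `&[u8]`, not only a `Vec<u8>`.
    key.extend_from_slice(&series_id.to_be_bytes());
    key
}
```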
Jake Goulding be3ed216c3 perf: Avoid taking `Vec` by reference
There's no benefit to accepting a reference to a `Vec` over a slice.

Further details available in https://stackoverflow.com/q/40006219/155423
2020-02-11 20:40:06 -05:00
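A small sketch of the point made in this commit, using a hypothetical `checksum` function: a parameter of type `&[u8]` accepts everything a `&Vec<u8>` parameter would, and also arrays and subslices.

```rust
// Taking a slice lets callers pass a `Vec`, an array, or a subslice.
// A `&Vec<u8>` parameter would force callers to own a `Vec`.
fn checksum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}
```

A call such as `checksum(&v)` with `v: Vec<u8>` still compiles, because `&Vec<u8>` coerces to `&[u8]` automatically.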
Jake Goulding b44f7d8869 perf: Avoid calculating the hashcode twice in the RocksDB adapter
Unfortunately, we can't use `Entry::or_insert_with` because we need to
use the key to construct the value.
2020-02-11 20:39:23 -05:00
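The pattern this commit describes can be sketched with an explicit `match` on the `Entry`. The `cached_len` function and the cache contents are hypothetical; only the `Entry` API calls are from the standard library.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Return the cached value for `name`, computing it from the key on first use.
fn cached_len(cache: &mut HashMap<String, usize>, name: String) -> usize {
    // `entry` hashes the key once; both arms reuse that lookup.
    match cache.entry(name) {
        Entry::Occupied(slot) => *slot.get(),
        Entry::Vacant(slot) => {
            // The key was moved into `entry`, but a vacant entry still
            // exposes it, so the value can be derived from the key.
            let value = slot.key().len();
            *slot.insert(value)
        }
    }
}
```

Rust releases after this commit added `Entry::or_insert_with_key`, which covers the same case in a single call.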
Jake Goulding 8c963ff7d1
Merge pull request #28 from influxdata/rust-2018-idioms
style: Enforce Rust 2018 idioms
2020-02-11 08:15:56 -05:00
Jake Goulding b5879c8414 style: Enforce Rust 2018 idioms
For reference, the [2018 edition guide][guide] talks about some of the
big differences.

The two that are applied here are:

1. `extern crate` is basically not needed at all anymore; you can do
    `use cratename` instead. This makes importing things more uniform
    between your own crate and other crates.

1. Rust does a reasonable amount of [*lifetime elision*][elision] so
    we don't have to type `'a` in as many places. However, one that
    ended up tripping up people is when a generic lifetime was part of
    a type. The compiler cared about this lifetime, but since it
    wasn't visible, people would forget it's there, then try to use it
    as if it wasn't constrained by the lifetime.

    A good example is the `Chars` iterator. It references the original
    `&str` and cannot live longer than the string. With the original
    way this was being passed (`&mut Chars`) it was visually evident
    that there was *some* lifetime, thanks to seeing the `&`, but it
    wasn't obvious that there's *another* lifetime — the string.

    With the addition of the *anonymous lifetime* (`'_`), it's now
    encouraged to use that when a type has a lifetime parameter that
    isn't relevant to prevent confusing mistakes that lead to compiler
    errors.

There are probably a few more things enabled by the lint as well. I
forget the exact reason that these are not yet enabled by default,
though.

[guide]: https://doc.rust-lang.org/edition-guide/rust-2018/index.html
[elision]: https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#lifetime-elision
2020-02-11 08:08:12 -05:00
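The `Chars` case from this commit message might look like the sketch below. The `next_word` function is hypothetical. The lint group involved is presumably `rust_2018_idioms`, for example via a crate-level `#![deny(rust_2018_idioms)]` attribute, though the commit message does not show how it is enabled.

```rust
use std::str::Chars;

// Writing `Chars<'_>` makes it visible that the iterator borrows from a
// string, without having to name that lifetime.
fn next_word(chars: &mut Chars<'_>) -> String {
    chars
        .by_ref()
        .skip_while(|c| c.is_whitespace())
        // `take_while` also consumes the whitespace that ends the word.
        .take_while(|c| !c.is_whitespace())
        .collect()
}
```

Under the lint, the older spelling `&mut Chars` draws a diagnostic about the hidden lifetime parameter, and `Chars<'_>` is the suggested form.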
Jake Goulding 05bd782423
Merge pull request #30 from influxdata/build
ci: Get the build back to green
2020-02-11 07:52:24 -05:00
Jake Goulding d8f3c31b8c ci: Prevent rustfmt and clippy from stopping the build for now 2020-02-09 22:23:27 -05:00
Jake Goulding 81468f9e5e ci: Adjust builds to run on highly parallel CI machines
By default, the RocksDB C library is compiled using the number of
cores on the machine. In CircleCI, this is 36 cores. Unfortunately,
that appears to completely blow out the memory usage, causing the C
compiler invocations to be killed.

This commit reduces the *entire* parallelism of the build to avoid
that.
2020-02-09 22:22:10 -05:00
Edd Robinson 7a9f69c921 refactor: address PR feedback 2020-02-07 22:28:11 +00:00
AJ d268087b24
Merge pull request #23 from influxdata/chore/add-circleci
chore(circleci): setup circleci
2020-02-07 17:23:36 -05:00
AJ Bond 7a65f5550f
chore(circleci): add tiered cache configuration 2020-02-07 12:25:32 -05:00
AJ Bond c2de967e36
chore(circleci): setup circleci
This configures a circleci pipeline that runs fmt, lint, test, and build operations.
I changed the fmt command to --check since --overwrite is no longer supported.
The pipeline will always run on nightly builds of rust.
2020-02-07 11:46:47 -05:00
Edd Robinson 5327efd926 test: add timestamp benchmarks 2020-02-07 13:26:59 +00:00
Edd Robinson bd8246d561 test: add integer encoder/decoder benchmarks 2020-02-07 13:11:38 +00:00
Edd Robinson 4185307d78 test: add float encoder/decoder benchmarks
This commit adds benchmarks for the float encoder and decoder. The
following scenarios are benchmarked:

- sequential values;
- random values;
- real CPU values (from Telegraf output).

Each scenario is benchmarked with a variety of block sizes.
2020-01-21 15:01:35 +00:00
Paul Dix 418b89a87b
Merge pull request #20 from influxdata/pd-add-stubs-for-other-datastores
feat: add memory backed inverted index and series store
2020-01-13 10:35:41 -05:00
Paul Dix f1329a8a08 Remove unused code 2020-01-13 10:34:41 -05:00
Paul Dix bec6b3cf9c Add MemDB and test framework
This commit adds a test framework for the InvertedIndex and SeriesStore traits. Structs that implement these traits can call into their tests to ensure they work.

This commit also adds an in-memory database that keeps time series data in a ring buffer per series and an inverted index in memory using a combination of HashMap and BTreeMap.
2020-01-13 10:31:47 -05:00
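A per-series ring buffer of the kind mentioned in this commit could be sketched with `VecDeque`. Everything here is an assumption for illustration: the `SeriesRingBuffer` name, the `(i64, f64)` point type, and the half-open range query are not taken from MemDB, whose actual layout the commit message does not show.

```rust
use std::collections::VecDeque;

/// Fixed-capacity buffer of (timestamp, value) points for one series.
/// Hypothetical sketch; not MemDB's real structure.
struct SeriesRingBuffer {
    capacity: usize,
    points: VecDeque<(i64, f64)>,
}

impl SeriesRingBuffer {
    fn new(capacity: usize) -> Self {
        SeriesRingBuffer {
            capacity,
            points: VecDeque::with_capacity(capacity),
        }
    }

    // Append a point, evicting the oldest one once the buffer is full.
    fn push(&mut self, time: i64, value: f64) {
        if self.capacity == 0 {
            return;
        }
        if self.points.len() == self.capacity {
            self.points.pop_front();
        }
        self.points.push_back((time, value));
    }

    // Collect the points whose timestamp falls in `[start, end)`.
    fn range(&self, start: i64, end: i64) -> Vec<(i64, f64)> {
        self.points
            .iter()
            .copied()
            .filter(|&(t, _)| t >= start && t < end)
            .collect()
    }
}
```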
Paul Dix 80493ba517 Refactor database for arbitrary backends
This commit pulls the database behavior into three traits: ConfigStore, InvertedIndex, and SeriesStore. The ConfigStore holds the Bucket definitions, the InvertedIndex holds the index of measurement names, tags, and fields, and the SeriesStore holds the raw time series data indexed only by bucket id, series id, and time.

This is the initial work to enable memory and S3 backed databases. It is also the preliminary work to break data out into shards.
2020-01-13 10:31:47 -05:00
Edd Robinson 62a0f26066 refactor: src doesn't need to be mutable 2020-01-13 14:59:51 +00:00
Edd Robinson 79592e7e11
Merge pull request #19 from influxdata/er-float-encoder
feat: add float encoder and decoder
2020-01-13 09:57:29 +00:00
Edd Robinson c1492994e0 refactor: tidy up 2020-01-12 15:37:30 +00:00