Commit Graph

199 Commits (9ec0ae26e18da84bf09ac6c5b4d50ec51234d3ed)

Author SHA1 Message Date
Paul Dix 1e966b5153 feat: implement API for storing the server configuration in object storage
This adds basic API calls for persisting and loading the server configuration of database rules and host groups to and from object storage. It stores all the data in a single JSON file.
2020-10-26 13:43:43 -06:00
Andrew Lamb ef501871bb
feat: remove partition_store (#387) 2020-10-26 14:39:38 -04:00
Andrew Lamb 4e1e8dbf79
chore: Upgrade version of Arrow/DataFusion (2 of 3) (#391)
* chore: Upgrade version of Arrow/DataFusion (2 of 3)

* fix: Fixup error type usage and use async stream interface

* fix: post merge fixups
2020-10-26 13:49:16 -04:00
Andrew Lamb 88b9f43110
chore: Upgrade version of Arrow/DataFusion (1 of 3) (#390)
* chore: Upgrade version of Arrow/DataFusion

* fix: update code for deps
2020-10-26 11:46:02 -04:00
Andrew Lamb 1004854403
refactor: remove unneeded dependencies, switch to tracing from log (#388) 2020-10-26 06:15:47 -04:00
Andrew Lamb 0ef76db208
feat: implement series_query for write buffer database, tests for same (#360)
* feat: implement series_query for write buffer database, tests for same

* fix: fixup comments

* fix: sort field columns too
2020-10-15 17:23:14 -04:00
Paul Dix 262a988207
Merge pull request #357 from influxdata/pd-cluster_replicate
chore: refactor cluster to use in memory write buffer
2020-10-14 09:43:02 -04:00
Paul Dix 9a345e226c chore: refactor cluster to use in memory write buffer
This refactors cluster to use the in memory write buffer. It removes the injected DatabaseStore as it is no longer needed.
2020-10-14 08:36:49 -04:00
Edd Robinson 6091963d50 test: skip NaN test for now 2020-10-14 13:21:15 +01:00
Edd Robinson 74ed1904c9 feat: fixed encoding for non-null numerics 2020-10-14 13:18:42 +01:00
Andrew Lamb 206df6a325
feat: implement data fusion execution and conversion to series sets (#353) 2020-10-13 16:53:00 -04:00
Paul Dix a80eb0fed3 feat: Store replicated writes
This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere.

Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.
2020-10-11 15:45:08 -04:00
Paul Dix 996f8905b6 feat: Implement partition templates and key generation
This commit implements partition templates as a struct that can be serialized and deserialized. It is composed of parts that can include the table name, a column name and its value, a formatted time, or a string column and regex captures of its value.
2020-10-10 11:32:17 -04:00
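The partition template described in the commit above can be illustrated with a minimal sketch. The type and method names here (`TemplatePart`, `PartitionTemplate`, `partition_key`) and the `-` and `_` separators are hypothetical, not taken from the commit. The sketch uses only the standard library, so it covers the table-name and column parts and omits serialization, formatted time, and regex captures.

```rust
use std::collections::BTreeMap;

/// One component of a partition key.
/// Hypothetical names for illustration; not the commit's actual types.
#[derive(Debug, Clone, PartialEq)]
enum TemplatePart {
    /// Emits the table name.
    Table,
    /// Emits `<column>_<value>` for the named column when the row has it.
    Column(String),
}

/// An ordered list of parts that together describe how to build a key.
#[derive(Debug, Clone, PartialEq)]
struct PartitionTemplate {
    parts: Vec<TemplatePart>,
}

impl PartitionTemplate {
    /// Builds a key by evaluating each part in order and joining the
    /// results with '-'. Column parts missing from the row are skipped.
    fn partition_key(&self, table: &str, row: &BTreeMap<String, String>) -> String {
        let pieces: Vec<String> = self
            .parts
            .iter()
            .filter_map(|part| match part {
                TemplatePart::Table => Some(table.to_string()),
                TemplatePart::Column(name) => {
                    row.get(name).map(|value| format!("{}_{}", name, value))
                }
            })
            .collect();
        pieces.join("-")
    }
}
```

Under these assumptions, a template with a `Table` part followed by a `Column("region")` part would map a `cpu` row whose `region` is `west` to the key `cpu-region_west`.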
Paul Dix cceeebb317
Merge pull request #342 from influxdata/pd-cluster-updates
feat: Update cluster with replication and subscriptions
2020-10-09 07:41:32 -04:00
Andrew Lamb 2b8c04f2b4
chore: Update arrow (again) to pick up latest changes to datafusion (#345) 2020-10-09 07:17:02 -04:00
Andrew Lamb a72e608810
feat: enable simd in arrow (#343) 2020-10-08 11:21:22 -04:00
Paul Dix 05dcbd7236 feat: Update cluster with replication and subscriptions
This updates cluster so that the concepts of replication and of subscriptions for handling queries are separated. It also adds a flatbuffer structure that can be used as a common format for replication.
2020-10-08 08:40:13 -04:00
Andrew Lamb bc5378c7fe
chore: Update arrow to latest version (#335)
* chore: Update arrow to latest version

* fix: Updates needed by new version of datafusion
2020-10-02 14:46:07 -04:00
Andrew Lamb ff29610e44
refactor: Switch back to https://github.com/apache/arrow (#333) 2020-10-01 16:57:12 -04:00
Andrew Lamb 2b98da593b
feat: write_database support for predicates (#326)
* feat: write_database support for predicates

* fix: temporarily pull in arrow fork to pick up fix for ARROW-10136

* fix: Update mutex usage based on PR feedback

* fix: more mutex polish and use OptionExt

* fix: update comments

* fix: rust-fu the table lookup

* fix: update docs

* fix: more idiomatic rust types

* fix: better usage of reference types
2020-10-01 14:34:53 -04:00
Edd Robinson a2287acb7c
Merge pull request #330 from influxdata/er/feat/segment-store-shell
feat: Segment Store shell
2020-10-01 14:01:45 +01:00
Edd Robinson bd6b0db691 refactor: address PR feedback 2020-10-01 13:13:32 +01:00
Paul Dix fdc86fd186
feat: add some initial framework for clustering (#329) 2020-09-30 14:41:42 -04:00
Andrew Lamb 8a14896487
chore: update version of datafusion (#324)
* chore: update version of datafusion

* chore: Update interfaces to be async
2020-09-30 08:02:15 -04:00
Edd Robinson 2470bdb975 feat: segment store shell 2020-09-30 11:25:59 +01:00
Andrew Lamb da5c74d3c6
feat: storage interface plans + executor (#318)
* feat: storage interface plans + executor

* refactor: less `expect`

* fix: use more idiomatic rust From
2020-09-28 11:41:10 -04:00
Andrew Lamb 0236522dfa
feat: Send panic information to tracing events (#313)
* feat: Send panic information to tracing events

* fix: PR Review improvements

* fix: PR comments

* fix: Apply suggestions from code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* fix: more fixes

* fix: clarify /cleanup drop more

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2020-09-25 14:55:58 -04:00
Edd Robinson ec1aaa3a47 chore: update dependencies 2020-09-25 17:22:48 +01:00
Edd Robinson 9eee0c2852 refactor: make clippy happy 2020-09-25 10:12:46 +01:00
Edd Robinson c42d2dcd79 refactor: rebase with delorean_arrow 2020-09-25 10:12:46 +01:00
Edd Robinson d0f3cae9b3 feat: add tag values schema API 2020-09-25 10:12:46 +01:00
Edd Robinson 47b2f7940b refactor: spike on arrow encoding 2020-09-25 10:12:46 +01:00
Edd Robinson e5f9c7c574 refactor: add encoding trait 2020-09-25 10:12:46 +01:00
alamb 54e9d38589 chore: update the refs to github 2020-09-25 10:12:46 +01:00
alamb 41899203d9 refactor: implement a prototype datafusion integration layer demonstration 2020-09-25 10:12:46 +01:00
alamb 820277a529 feat: load segments from parquet 2020-09-25 10:12:46 +01:00
alamb acfef35a0e feat: load segments from parquet 2020-09-25 10:12:46 +01:00
alamb 7f815099d0 feat: Read from parquet rather than arrow 2020-09-25 10:12:46 +01:00
Edd Robinson a5a8667a42 feat: group by sorting 2020-09-25 10:12:46 +01:00
Edd Robinson 231f429a56 feat: sort group by measurement 2020-09-25 10:12:46 +01:00
Edd Robinson 2387b7c849 feat: add support for group by aggregate 2020-09-25 10:12:46 +01:00
Edd Robinson aba02cb731 feat: basic store 2020-09-25 10:12:46 +01:00
Andrew Lamb 77f58efca7
chore: update Arrow/Parquet/DataFusion versions, consolidate references into new crate (#309)
* chore: consolidate all arrow/parquet/datafusion dependencies

* chore: update datafusion version
2020-09-24 08:46:54 -04:00
Andrew Lamb 498478c066
refactor: rename delorean_storage_interface to delorean_storage (#308) 2020-09-22 17:18:53 -04:00
Andrew Lamb d0f2902c8d
feat: implement tag_keys and measurement_tag_keys (#307)
* feat: implement tag_keys and measurement_tag_keys

* fix: fix timestamp bound evaluation
2020-09-22 16:42:45 -04:00
Jake Goulding 648d42568d feat: Add a benchmark for restoring the WAL 2020-09-18 16:45:01 -04:00
alamb 2418ee5ab0 refactor: move partitioned_store into its own module 2020-09-18 08:12:19 -04:00
Andrew Lamb 642b1b4370
refactor: move write_buffer to delorean_write_buffer crate (#299) 2020-09-18 08:11:48 -04:00
Andrew Lamb d2c24ef7af
refactor: pull storage interface into delorean_storage_interface (#298) 2020-09-18 07:58:19 -04:00
Andrew Lamb 5fe3bfd53c
refactor: extract WalDetails into delorean_wal_writer crate (#297) 2020-09-18 07:47:37 -04:00
Carol (Nichols || Goulding) 596c987956 feat: Compress WAL entries with Snappy
Fixes #276.
2020-09-14 09:42:54 -04:00
Andrew Lamb 82d5f485c3
test: traits for database and tests for http handler (#284)
* test: traits for database and tests for http handler

* refactor: Use generics and trait bounds instead of trait objects

* refactor: Replace trait objects with an associated type

* refactor: Extract an associated Error type on the Database traits

* refactor: Remove some explicit conversions to_string that Snafu takes care of

* docs: add comments

* refactor: move traits into storage module

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
2020-09-11 17:42:00 -04:00
alamb 9b9ff484bb fix: implement escaping 2020-09-11 17:14:35 -04:00
Paul Dix 8ed3a1b440
feat: Initial prototype of WriteBuffer and WAL (#271)
This is the initial prototype of the WriteBuffer and WAL. This does the following:

* accepts a slice of ParsedLine into the DB
* writes those into an in memory structure with tags represented as u32 dictionaries and all field types supported
* persists those writes into the WAL as Flatbuffer blobs (one WAL entry per slice of lines written, or WriteBatch)
* has a method to return a table from the buffer as an Arrow RecordBatch
* recovers the WAL after the database is closed and opened back up again
* has a single test that covers the end-to-end from the DB side
* doesn't include partitioning yet, although the write_lines method does try to partition on time. That will be changed to something more general, defined by a per-database configuration.
* hooked up to the v2 HTTP write API
* hooked up to a read API which will execute a SQL query against the data in the buffer

This includes a refactor of the WAL:

Refactors the WAL to remove async and threading so that it can be moved higher up. This simplifies the API while keeping just about the same amount of code in PartitionStore to handle the asynchronous writes.

This also modifies the WAL to remove the SideFile implementation, which was causing significant performance problems and write amplification. The downside is that WAL writes are no longer guaranteed atomic.

Further, this modifies the WAL to keep the active segment file handle open. Appends no longer have to list the directory contents, look for the latest file, and open a file handle, which should also improve performance and reduce IOPS.
2020-09-08 14:12:16 -04:00
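The commit above mentions storing tags "as u32 dictionaries". As a rough illustration of that general technique (dictionary encoding of strings to `u32` ids), here is a minimal standard-library sketch. The `Dictionary` type and its methods are hypothetical and are not the WriteBuffer's actual implementation.

```rust
use std::collections::HashMap;

/// Maps each distinct string to a small integer id so repeated tag
/// values can be stored as u32 instead of full strings.
/// Hypothetical type for illustration only.
#[derive(Debug, Default)]
struct Dictionary {
    /// String -> id lookup used on the write path.
    ids: HashMap<String, u32>,
    /// id -> string lookup used on the read path; the id is the index.
    values: Vec<String>,
}

impl Dictionary {
    /// Returns the existing id for `value`, or assigns the next free id.
    fn lookup_or_insert(&mut self, value: &str) -> u32 {
        if let Some(&id) = self.ids.get(value) {
            return id;
        }
        let id = self.values.len() as u32;
        self.values.push(value.to_string());
        self.ids.insert(value.to_string(), id);
        id
    }

    /// Resolves an id back to its string, if the id has been assigned.
    fn value(&self, id: u32) -> Option<&str> {
        self.values.get(id as usize).map(String::as_str)
    }
}
```

In this sketch ids are assigned in insertion order starting at 0, so a column of tag values can be held as a `Vec<u32>` alongside one shared dictionary.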
Carol (Nichols || Goulding) d59702ec79 feat: Make the create bucket HTTP API match the Influx 2.0 API
The `/api/v2/create_bucket` API was delorean-specific for testing
purposes. This change makes it match the [Influx 2.0 API][influx] and
adds a method to the client for creating buckets.

The client will always send an empty array of `retentionRules` because
that is a required parameter for the Influx API. Delorean always ignores
`retentionRules`. The `description` and `rp` parameters are optional and
are never sent.

[influx]: https://v2.docs.influxdata.com/v2.0/api/#operation/PostBuckets

I believe the gRPC create bucket is also delorean-specific and perhaps
not needed, but I'm leaving it in for now with a note.
2020-08-12 10:08:32 -04:00
Edd Robinson 21c0155271 fix: improve pivot for certain sorts 2020-08-04 21:33:58 +01:00
Carol (Nichols || Goulding) 19159138cc fix: Turn off default features of parquet so arrow-flight doesn't repeatedly rebuild
Fixes #261
2020-07-30 09:43:12 -04:00
alamb f946e84a12 chore: revert upgrade parquet dependency to 1.0.0
This reverts commit 25259b4c99.
2020-07-30 07:02:53 -04:00
alamb 25259b4c99 chore: upgrade parquet dependency to 1.0.0 2020-07-28 15:11:35 -04:00
Carol (Nichols || Goulding) 0709f90040 test: Add a mock server test in the client crate for the newline bug 2020-07-27 14:10:54 -04:00
Jake Goulding b72c2ffd73
Merge pull request #253 from influxdata/client-dynamic-data-point 2020-07-24 09:50:11 -04:00
Carol (Nichols || Goulding) c179a7e8b2 fix: Remove generate/seed utilities
These are going to be redone in the fusion repo.
2020-07-22 17:15:30 -04:00
Jake Goulding f8304e6e6b feat: Add a dynamic type to construct data points for ingestion 2020-07-22 17:03:29 -04:00
Andrew Lamb 143c350ecb
Merge pull request #250 from influxdata/alamb/feat-multi-col-stats
feat: Update stats command to handle directories of files
2020-07-20 16:48:31 -04:00
alamb ca1bd79902 feat: Update stats command to handle directories of files 2020-07-17 16:47:11 -04:00
Carol (Nichols || Goulding) 668aefae9b feat: Implement a rudimentary write API in the influx client 2020-07-17 10:28:19 -04:00
Carol (Nichols || Goulding) 7ed24241b5 feat: Set up an InfluxData 2.0 client crate 2020-07-17 10:27:33 -04:00
Carol (Nichols || Goulding) b3a16c080f feat: Update croaring
Jake dug into why the end-to-end tests fail with delorean running in the
Docker image I built, and it appears to be a crash with an illegal
instruction from CRoaring.

We think it's this issue: https://github.com/saulius/croaring-rs/pull/62
which was merged and released, so let's try updating CRoaring.
2020-07-08 08:49:28 -04:00
Edd Robinson 831f647b9d feat: implement escaped tsm key parsing 2020-07-04 08:46:45 -04:00
Edd Robinson 06e9fae845 fix: ignore conflicting field types
Fixes #205.
2020-06-30 18:08:05 +01:00
Andrew Lamb 97a5eb7e19
Merge pull request #197 from influxdata/alamb/log-requests
feat: Log gRPC calls using trace crate, allow custom log levels
2020-06-30 10:47:11 -04:00
alamb 283d6691c6 feat: enable rpc debug tracing, tweaked logging levels, respect RUST_FMT env var 2020-06-29 09:59:22 -04:00
Jake Goulding ad1e3d04bb feat: Add a local filesystem implementation of the object store 2020-06-29 08:48:48 -04:00
Edd Robinson d15256e0e7 refactor: address PR feedback 2020-06-26 12:08:42 +01:00
Edd Robinson 9d889828c3 fix: ensure all rows are emitted for each column 2020-06-26 11:50:37 +01:00
alamb 68ce351a3a refactor: remove direct parquet dependency from delorean_ingest 2020-06-23 16:58:31 -04:00
Carol (Nichols || Goulding) d7dbf061cb feat: Implement String encoding/decoding
Fixes #148.
2020-06-22 15:15:34 -04:00
Edd Robinson 106bd69b5a feat: support converting from TSM->Parquet 2020-06-22 18:56:17 +01:00
Edd Robinson 85e0b4ec16 refactor: hoist tsm reader into own crate 2020-06-22 18:56:17 +01:00
Edd Robinson ac7bb6bf68 refactor: make Packer generic 2020-06-22 11:24:29 +01:00
Jake Goulding bfb0213ac3 feat: Update Rusoto to allow streaming data on uploads 2020-06-19 09:18:44 -04:00
Andrew Lamb 8185c80c03
fix: fix logical merge conflict (#169) 2020-06-18 18:51:25 -04:00
Andrew Lamb ae37548980
feat: Add support for parsing string values in line protocol parser (#155)
* feat: add debug logging on parser error

* feat: Add support for parsing string values in line protocol parser

* fix: Fix comment

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-18 12:44:17 -04:00
Andrew Lamb cf248f2143
feat: upgrade to latest arrow / byteorder (#154) 2020-06-17 12:50:23 -04:00
Carol (Nichols || Goulding) d83c410a5c feat: Update to the released version of cloud-storage
My submitted API improvements got merged in!
2020-06-10 17:23:52 -04:00
Carol (Nichols || Goulding) d3283b1096 feat: Object storage in S3 and GCS 2020-06-10 17:23:52 -04:00
Andrew Lamb faf3f534ac
refactor: move all dstool code into delorean binary (#131)
* refactor: move all dstool code into delorean binary

* fix: Move code/mods to make it compile and run

* fix: warn if db dir does not exist

* refactor: Match argument subcommands w/ more idiomatic rust

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

fix: restore hyper logging

fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: update expected code

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-10 16:04:46 -04:00
Andrew Lamb 0415b233ec
refactor: Instantiate the table writer on demand (#128)
* refactor: instantiate ParquetWriter on demand, prep for multi measurements

* fix: doc test

* fix: update names
2020-06-09 16:11:42 -04:00
Andrew Lamb 986e12d62a
refactor: Rename crate line_protocol_schema --> delorean_table_schema (#129)
* refactor: Rename crate line_protocol_schema --> delorean_table_schema

* fix: fmt
2020-06-09 11:56:16 -04:00
Andrew Lamb f1a3058b24
feat: Add file / metadata inspection + dumping with dstool (#112)
* feat: Add file / metadata inspection + dumping

* fix: apply some PR review comments

* fix: apply suggestions from code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* feat: Add tests, rearrange code into modules, add gzip aware interface

* fix: fix comment and test

* fix: test output and fmt

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2020-06-09 10:10:55 -04:00
Andrew Lamb 8475b6d183
feat: Add parquet writer, hook up conversion in dstool (#124)
* feat: Add parquet writer, hook up conversion in dstool

* fix: use bigger executor for test

* fix: less cloning

* fix: make unsupported messages less pejorative

* fix: fmt

* fix: Rename writer and do not require std::File, add example

* fix: clippy and fmt

* fix: remove unnecessary module in end to end tests

* fix: remove strange use of tempfile

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: cleanup use

* fix: Use more specific error messages

* fix: comment tweak

* fix: touchup temp path creation

* fix: clippy!

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-08 16:25:24 -04:00
Andrew Lamb ca9f9d4cae
feat: Add column packing code (#114)
* feat: Add column packing code

* fix: remove dependency on assert_approx_equal in favor of delorean_test_helpers

* fix: Cleanups from pr comments

* fix: Apply suggestions from code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: more cleanup per code review

* fix: pr comments

* fix: remove explicit string creation from caller

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-06 06:04:41 -04:00
Andrew Lamb 2200def8ea
feat: Use rust nightly (#123) 2020-06-05 17:45:44 -04:00
Jake Goulding 68fb580b43
style: Re-enable the elided lifetimes lint and move generated types to their own crate (#119)
* refactor: rename the module containing generated types

The nested `delorean` was confusing anyway, and this will make more
sense when we extract a new crate.

* refactor: Move the generated types to their own crate

This allows us to have more lax warnings in that crate alone, keeping
the main crate more strict.

* style: Re-enable elided lifetimes lint in the main crate
2020-06-05 16:22:27 -04:00
Edd Robinson 887ffd5977 refactor: remove lifetime to make index re-usable 2020-06-04 14:36:43 +01:00
Edd Robinson e3db077121 feat: add API for series key information 2020-06-04 14:36:43 +01:00
Edd Robinson 413738a264 feat: support org and bucket ID in entries 2020-06-04 14:36:43 +01:00
Andrew Lamb 234b2f5752
feat: Line Protocol Schema extractor (#108)
* feat: schema inference from iterator of parsed lines

* fix: Clean up error handing even more

* fix: fmt

* fix: make a sacrifice to the clippy gods
2020-06-03 18:29:57 -04:00
Andrew Lamb 5d2c5de39d
feat: Structs to represent line protocol schema (#103)
* feat: Structs to represent line protocol schema

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2020-06-03 08:39:35 -04:00
Andrew Lamb 18b05ce9ef
fix: move test of dstool to its delorean_storage_tool package (#107) 2020-06-02 16:10:30 -04:00