1110 Commits (a3b88d55067c3b022e242ebc641cb563f04c4e4e)

Author SHA1 Message Date
Carol (Nichols || Goulding) 4faf5f04dc refactor: Implement Default on Db to make creation easier 2020-10-16 13:57:16 -04:00
Andrew Lamb 6a0de3a4c9
fix: add logging to read_filter rpc route (#363) 2020-10-16 09:37:49 -04:00
Andrew Lamb dc4898e1e4
feat: conversion between SeriesSets and ReadResponses (#362)
* feat: conversion between SeriesSets and ReadResponses

* fix: Address PR review comments 1

* fix: Address PR review comments 1
2020-10-16 06:37:50 -04:00
Paul Dix 063cc4cee8
Merge pull request #361 from influxdata/pd-cluster_replicate
feat: Implement simple replication to host groups
2020-10-15 18:15:57 -04:00
Paul Dix ac072d22a4 feat: Implement simple replication to host groups
This looks at the database rules and replicates writes to the replicate host groups specified. Later commits will add hashing based on partition key and handling replication errors from remote servers.
2020-10-15 17:47:55 -04:00
Andrew Lamb 0ef76db208
feat: implement series_query for write buffer database, tests for same (#360)
* feat: implement series_query for write buffer database, tests for same

* fix: fixup comments

* fix: sort field columns too
2020-10-15 17:23:14 -04:00
Paul Dix 262a988207
Merge pull request #357 from influxdata/pd-cluster_replicate
chore: refactor cluster to use in memory write buffer
2020-10-14 09:43:02 -04:00
Edd Robinson 7924d081b4
Merge pull request #344 from influxdata/er/feat/fixed-encoding
feat: fixed-width column encodings
2020-10-14 14:02:13 +01:00
Paul Dix 9a345e226c chore: refactor cluster to use in memory write buffer
This refactors cluster to use the in memory write buffer. It removes the injected DatabaseStore as it is no longer needed.
2020-10-14 08:36:49 -04:00
Edd Robinson 6091963d50 test: skip NaN test for now 2020-10-14 13:21:15 +01:00
Edd Robinson 08a603ce7f refactor: PR feedback 2020-10-14 13:18:43 +01:00
Edd Robinson c137a2d5ab refactor: add module documentation 2020-10-14 13:18:43 +01:00
Carol (Nichols || Goulding) 0bfaeedc40 refactor: use PartialOrd fns instead of Orderings 2020-10-14 13:18:43 +01:00
Edd Robinson 085c3d9152 test: add test for min/max on NaN 2020-10-14 13:18:43 +01:00
Edd Robinson 35f9169e5a refactor: bench typo 2020-10-14 13:18:43 +01:00
Edd Robinson e05add3fda refactor: change name of arrow encoding 2020-10-14 13:18:43 +01:00
Edd Robinson a72dfe3eef refactor: forgot single precision floats 2020-10-14 13:18:43 +01:00
Edd Robinson f6cf3a66f2 test: add null tests 2020-10-14 13:18:43 +01:00
Edd Robinson 25a9250138 refactor: add some initial benchmarks for sum 2020-10-14 13:18:42 +01:00
Edd Robinson fa38c0a981 feat: nullable fixed width encoding with arrow 2020-10-14 13:18:42 +01:00
Edd Robinson 74ed1904c9 feat: fixed encoding for non-null numerics 2020-10-14 13:18:42 +01:00
Paul Dix 0d6bfd2f29
Merge pull request #356 from influxdata/pd-refactor_cluster_write_buffer
feat: Make table return all columns if none specified for arrow batch
2020-10-14 07:34:09 -04:00
Paul Dix dbc6b7b2d6 feat: Make table return all columns if none specified for arrow batch 2020-10-14 07:28:58 -04:00
Andrew Lamb 1326c831c6
docs: Motivate the use of Arc in SeriesSet and SeriesSetPlan (#354) 2020-10-13 18:11:32 -04:00
Andrew Lamb 206df6a325
feat: implement data fusion execution and conversion to series sets (#353) 2020-10-13 16:53:00 -04:00
Andrew Lamb 246a3d4400
docs: Update comments (#352) 2020-10-12 20:04:34 -04:00
Andrew Lamb 80088ffe37
feat: gRPC plumbing + interface structures for read_filter (#351)
* feat: gRPC plumbing + support structures for read_filter

* fix: cleanup comments
2020-10-12 14:12:53 -04:00
Paul Dix befd386088
Merge pull request #347 from influxdata/pd-partition-key-generation
feat: Implement partition templates and key generation
2020-10-12 08:26:11 -04:00
Paul Dix 77e732cc69
Merge pull request #349 from influxdata/pd-replicate
feat: Store replicated writes
2020-10-12 08:17:42 -04:00
Paul Dix a80eb0fed3 feat: Store replicated writes
This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere.

Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.
2020-10-11 15:45:08 -04:00
Paul Dix 996f8905b6 feat: Implement partition templates and key generation
This commit implements partition templates as a struct that can be serialized and deserialized. It is comprised of parts that can include the table name, a column name and its value, a formatted time, or a string column and regex captures of its value.
2020-10-10 11:32:17 -04:00
Paul Dix cceeebb317
Merge pull request #342 from influxdata/pd-cluster-updates
feat: Update cluster with replication and subscriptions
2020-10-09 07:41:32 -04:00
Andrew Lamb 2b8c04f2b4
chore: Update arrow (again) to pick up latest changes to datafusion (#345) 2020-10-09 07:17:02 -04:00
Andrew Lamb aaeb0d4c84
refactor: implement automatic error conversion for errors that do not have lots of context (#341)
* refactor: implement automatic error conversion for errors that do not have lots of context

* fix: implement code review suggestions
2020-10-08 11:21:54 -04:00
Andrew Lamb a72e608810
feat: enable simd in arrow (#343) 2020-10-08 11:21:22 -04:00
Paul Dix 05dcbd7236 feat: Update cluster with replication and subscriptions
This updates cluster so that the concepts of replication and subscriptions for handling queries are separated. It also adds a flatbuffer structure that can be used as a common format for replication.
2020-10-08 08:40:13 -04:00
Andrew Lamb 5400c55b2a
refactor: apply timestamp predicate in visit code (#340) 2020-10-07 12:33:04 -04:00
Andrew Lamb 9a81bf4d72
feat: implement column_values for write buffer database (#339) 2020-10-07 10:12:28 -04:00
Andrew Lamb 3ba1a95795
refactor: extract "traverse the write buffer structure" into a visitor trait/pattern (#338) 2020-10-06 17:08:46 -04:00
Andrew Lamb 3d670fb556
feat: Implement gRPC routes tag_values and measurement_tag_values (#337) 2020-10-06 17:07:03 -04:00
Andrew Lamb bc5378c7fe
chore: Update arrow to latest version (#335)
* chore: Update arrow to latest version

* fix: Updates needed by new version of datafusion
2020-10-02 14:46:07 -04:00
Paul Dix 1b69a5a79c
refactor: WriteBuffer database and WAL Flatbuffers (#331)
* chore: Refactor write buffer WAL

This commit refactors the WAL to remove partition events and to collapse rows into a single write buffer entry.
This further simplifies the WAL by removing WriteBufferBatch.
Finally, this removes the concept of a partition generation as that is currently not used.

* refactor: WriteBuffer database and WAL Flatbuffers

This refactor updates the WriteBuffer write path significantly. At the public API it takes parsed lines, but then immediately converts them over to a built Flatbuffer byte array, which has also been significantly refactored.

The Flatbuffer structure has been updated so that a WriteBufferBatch contains a vec of WriteBufferEntry. Each of those entries corresponds to a collection of data that is bound for a single partition. The generated partition key is now kept as part of this entry.

Within the WriteBufferEntry you now have a vec of TableWriteBatch, each of which has the table name and a vec of Row. This pulls the table name out of the row, eliminating redundancy for writes that have multiple rows being written into the same table.

The database now has methods to accept the Flatbuffer WriteBufferEntry with updates down the line to Partition and Table.

This also has a nice little performance bump for WAL restore:

wal-restoration/restore_single_entry_single_partition
                        time:   [684.51 us 688.45 us 692.53 us]
                        thrpt:  [1.4440 Melem/s 1.4525 Melem/s 1.4609 Melem/s]
                 change:
                        time:   [-55.913% -55.351% -54.800%] (p = 0.00 < 0.05)
                        thrpt:  [+121.24% +123.97% +126.82%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

wal-restoration/restore_multiple_entry_multiple_partition
                        time:   [8.7483 ms 8.8964 ms 9.0815 ms]
                        thrpt:  [1.3214 Melem/s 1.3489 Melem/s 1.3717 Melem/s]
                 change:
                        time:   [-55.952% -55.166% -54.213%] (p = 0.00 < 0.05)
                        thrpt:  [+118.40% +123.04% +127.02%]
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

* fix: fmt

Co-authored-by: alamb <andrew@nerdnetworks.org>
2020-10-02 13:52:00 -04:00
Andrew Lamb 45c4f1e24e
refactor: make table_names API consistent with tag_keys API, other cleanups (#327) 2020-10-02 09:42:06 -04:00
Andrew Lamb 0a48c04a9b
refactor: improve predicate conversion code (#325) 2020-10-01 17:26:39 -04:00
Andrew Lamb ff29610e44
refactor: Switch back to https://github.com/apache/arrow (#333) 2020-10-01 16:57:12 -04:00
Andrew Lamb 3d7d4111be
fix: Upgrade the resource class used to run CI tests (#332) 2020-10-01 14:56:32 -04:00
Andrew Lamb 2b98da593b
feat: write_database support for predicates (#326)
* feat: write_database support for predicates

* fix: temporarily pull in arrow fork to pick up fix for ARROW-10136

* fix: Update mutex usage based on PR feedback

* fix: more mutex polish and use OptionExt

* fix: update comments

* fix: rust-fu the table lookup

* fix: update docs

* fix: more idiomatic rust types

* fix: better usage of reference types
2020-10-01 14:34:53 -04:00
Edd Robinson a2287acb7c
Merge pull request #330 from influxdata/er/feat/segment-store-shell
feat: Segment Store shell
2020-10-01 14:01:45 +01:00
Edd Robinson bd6b0db691 refactor: address PR feedback 2020-10-01 13:13:32 +01:00
Edd Robinson 30c1c9c615
refactor: Update delorean_segment_store/src/table.rs 2020-10-01 12:16:36 +01:00